ABSTRACT


Recent  technological  advances  in  the  design  and  manufacturing  of  night  vision 
multispectral  sensors  now  allow  spatially  registered  imagery  provided  by  each  of  the 
sensors  to  be  combined  within  a  single  fused  image  for  display  to  an  end  user.  The 
product  is  a  multispectral  false-colored  rendering  of  the  imaged  scene.  The  use  of  false 
color  in  fused  imagery  may  facilitate  object  recognition,  providing  contour  information  of 
the  objects  present  in  the  scene,  but  incongruently  colored  fused  imagery  may  be 
disruptive of perceptual performance. This study investigated whether false color
imagery, compared with natural color imagery, helps or hinders object recognition.
Subjects'  reaction  times  (RTs)  and  error  rates  were  measured  in  a  standard  naming  task. 
Stimuli  consisted  of  photographs  of  food  objects  that  had  been  manipulated  in  color 
(natural  color,  false  color,  natural  grayscale,  and  false  grayscale)  and  noise  (three  levels). 
The  results  of  the  experiment  showed  similar  differences  in  RTs  between  color  images 
(natural  or  false)  and  their  grayscale  counterparts  at  different  levels  of  noise,  indicating 
that both color conditions were similarly helpful in object recognition. These results suggest
that false color may be useful in multispectral sensor displays because it
facilitates image segmentation in shape-degraded images.
EXECUTIVE SUMMARY


Current night vision devices (NVDs) used in military operations, such as night
vision goggles (NVG) and forward looking infrared (FLIR) systems, were designed to
allow operations in low visibility conditions. New military tactics demand
capabilities that current NVDs only partially provide.

Infrared (IR) systems sense radiated energy, detecting thermal differences between
an object and its background. Image intensifier (I²) sensors amplify reflected moonlight
and starlight, taking advantage of nighttime illumination conditions. Because the two
sensor types respond to widely separated wavebands within the electromagnetic (EM) spectrum, each
sensor suffers disadvantages that the other does not, and these can change depending on the
atmospheric and environmental conditions. Nevertheless, the two sensing
modalities appear to be complementary. Accordingly, fusing the imagery from
these two sensors into a single display may result in operator performance equal to or
better than that obtained with either single-band sensor alone. This
technique is known as dual-band sensor fusion.

Currently  there  is  no  field  capacity  to  combine  the  best  attributes  of  both  sensors 
into a single fused image. Recent experimental advances in sensing and data display
have  permitted  good  progress  in  real  time  image  fusion  and  display  of  multispectral 
sensors  in  either  monochrome  or  synthetic  chromatic  form. 

The  image  processing  challenge  is  to  generate  an  intuitively  meaningful  color 
image  on  a  display  for  a  human  viewer.  Algorithms  to  perform  this  function  in  an 
optimum  manner  are  currently  under  development.  Since  neither  sensor  is  in  the  visible 
waveband, the artificial color mappings used by some fusion algorithms will
generally produce false-color imagery whose chromatic characteristics do not correspond
in  any  intuitive  or  obvious  way  to  those  of  a  scene  viewed  under  natural  photopic 
illumination.  To  the  degree  that  human  perception  relies  on  stored  knowledge  of  objects’ 
chromatic  characteristics,  false  color  images  may  be  disruptive  of  perceptual 
performance,  making  colored  sensor  fusion  unhelpful  in  object  recognition. 

The  reason  for  using  color  in  fused  imagery  is  based  on  the  assumption  that  color 
(natural  or  artificial)  facilitates  image  segmentation,  providing  contour  information  about 
the  individual  objects  present  in  that  scene  as  a  way  to  achieve  target  detection.  Past 
psychophysical  research  has  been  equivocal  in  determining  what  utility,  if  any,  sensor 
fusion  has  for  human  performance.  Research  in  this  field  has  been  inconclusive  due  to 
differing  experimental  methodologies  used  in  these  studies. 

In  order  to  measure  the  effectiveness  of  sensor  fusion  devices  in  enhancing  the 
night capabilities of military operators over currently employed systems, detailed
exploration  in  the  area  of  human  factors  was  required. 

The  objective  of  this  thesis  was  to  quantitatively  assess  the  role  of  natural  and 
artificial  color  in  object  recognition  when  shape  information  is  degraded,  investigating 
whether and how false color might be useful in multi-band fused imagery. Digital
photographs of natural objects (fruits and vegetables) were presented as natural and false
color images, together with their gray scale counterparts, degraded by different levels of
noise to emulate imagery generated by multispectral devices, and were compared in a
standard naming task. Two precise measures of visual ability that are
critical to the military, reaction time (RT) and accuracy in target detection, were
measured.

Natural color might facilitate object recognition in either or both of two ways: by
facilitating scene segmentation and by allowing access to stored color knowledge. With
false colored images, recognition might be disrupted because access to
stored knowledge is denied, and participants would have to rely on color contrast alone to
recognize objects through scene segmentation. It was hypothesized that shorter
RTs  and  greater  accuracy  rates  would  occur  within  the  natural  color  images  across  all 
levels  of  noise  and  the  difference  in  RTs  between  natural  color  and  false  color  images 
would be largest in the conditions with the greatest amount of noise. The longest RTs and
greatest error rates were expected with the grayscale images, because participants
would be able to use color neither for scene segmentation nor for access to stored knowledge
during the object recognition task. False color images were expected to yield intermediate
results, because the presence of color at least supports scene segmentation.

The analysis conducted to assess the benefit of using color in
object recognition leads to the conclusion that the natural and false hue conditions were
equally beneficial in the experimental task. There was no
evidence that false color disrupted performance, and natural and false
hue were similarly useful at different levels of image degradation. This
conclusion rests on the assumption that participants used a bottom-up process
during the object recognition task, making use of color (natural or false) to achieve image
segmentation.
I. INTRODUCTION


Current  night  vision  devices  (NVDs)  used  in  military  operations,  such  as  night 
vision  goggles  (NVG)  and  forward  looking  infrared  (FLIR)  systems,  were  designed  to 
allow operations in low-visibility conditions. New military tactics demand
capabilities that current NVDs only partially provide. Greater target
discrimination  from  decoys  and  background  clutter  is  needed,  together  with  greater 
display  resolution,  adequate  magnification  properties,  and  larger  fields  of  view  (Krebs, 
Scribner,  Miller,  Ogawa  &  Schuler,  1998;  Marine  Corps  CDC,  1995).  By  combining 
information  from  multiple  single-band  sources  within  a  unitary  display,  researchers  hope  to 
overcome  perceptual  limitations  inherent  in  the  images  provided  by  various  electro-optical 
sensors  singly  (Sinai,  McCarley,  Krebs  &  Essock,  1999a). 

Infrared (IR) systems sense radiated energy, detecting thermal differences between
an object and its background. Image intensifier (I²) sensors amplify reflected moonlight
and starlight, taking advantage of nighttime illumination conditions. Because the two
sensor types respond to widely separated wavebands within the electromagnetic (EM) spectrum, each
sensor has advantages and suffers disadvantages that the other does not, and these can change
depending on the atmospheric and environmental conditions (Sinai et al., 1999a). For
example, resolution is better in I² sensors, but contrast between heat-emitting objects and
their surroundings is better captured by IR sensors (Sinai et al., 1999a).

Limitations in each of the sensing modalities can sometimes be disorienting by creating
visual illusions (Crowley, Rash & Stephens, 1992), while alternating between these
modalities can be difficult, confusing and distracting (Rabin & Wiley, 1994).

Nevertheless, the two sensing modalities appear to be complementary. Accordingly,
fusing the imagery from these two sensors into a single display
may result in operator performance equal to or better than that obtained with either
single-band sensor alone. This technique, known as dual-band sensor fusion, could also
provide scene information not present in either single-band image alone by deriving
emergent information based on the difference between the sensors (Sinai et al., 1999a).

The  contrast  available  in  a  fused  image  is  often  displayed  as  a  monochrome  or  gray 
scale  image  (Therrien,  Scrofani  &  Krebs,  1997;  Peli,  Peli,  Ellis  &  Stahl,  1999). 

Techniques  developed  to  introduce  synthetic  color  to  fused  imagery  (Scribner,  Satyshur, 
Schuler  &  Kruer,  1996;  Waxman,  Gove,  Seibert,  Fay,  Carrick,  Racamato,  Savoye,  Burke, 
Reich,  McGonagle  &  Craig,  1996;  Scribner,  Warren,  Schuler,  Satyshur  &  Kruer,  1998; 
Krebs, McCarley, Kozek, Miller, Sinai & Werblin, 1999a), which attempt to provide additional
information through color contrast, are examples of the emergent information produced
by sensor fusion.

For a human operator, the multiple sources of imagery need to be fused and
displayed in a form that is easy and natural to interpret, thereby improving operator
performance (Peli et al., 1999). Currently there is no field capacity to combine the best
attributes of both sensors into a single fused image. Recent experimental advances in
sensing and data display have permitted good progress in real time image fusion and
display of multispectral sensors in either monochrome or synthetic chromatic form
(McDaniel, Scribner, Krebs, Warren, Ockman & McCarley, 1998).

New image processing techniques are needed to combine the multispectral
images so that the resultant image has more information content than any of the
original images, as several researchers have demonstrated (Scribner et al., 1998;
Krebs et al., 1999a; Therrien et al., 1997; Waxman, Aguilar, Fay, Ireland, Racamato,
Ross, Carrick, Gove, Seibert, Savoye, Reich, Burke, McGonagle & Craig, 1998). This
requires studies in data formatting such as color-coding or object enhancement (e.g.,
towers or hanging power lines for obstacle avoidance) (McDaniel et al., 1998). The image
processing challenge is to generate an intuitively meaningful color image on a display for a
human viewer, one that improves operator performance by facilitating discrimination of
objects from backgrounds and situational awareness by means of scene segmentation.

Past  psychophysical  research  has  been  equivocal  in  determining  what  utility,  if  any, 
sensor  fusion  has  for  human  performance.  While  some  studies  have  found  a  significant 
advantage  for  fused  imagery  over  single  sensor  imagery  (Essock  et  al.,  1999a;  Toet, 
Ijspeert,  Waxman  &  Aguilar,  1997;  Waxman  et  al.,  1996),  others  have  not  (Steele  and 
Perconti, 1997; Krebs et al., 1998; Essock, Sinai, McCarley, DeFord & Srinivasan,
1999b). These discrepant results can be attributed to the differences in fusion algorithms
tested,  and  to  the  differences  in  the  psychophysical  tasks  employed  (Essock  et  al.,  1999b). 
It is therefore not obvious that sensor fusion will benefit perceptual performance
(Sinai, McCarley & Krebs, 1999b).

Since neither sensor operates in the visible waveband, the artificial color mappings
used by some fusion algorithms will generally produce false-color imagery whose
chromatic  characteristics  do  not  correspond  in  any  intuitive  or  obvious  way  to  those  of  a 
scene  viewed  under  natural  photopic  illumination.  To  the  degree  that  human  perception 
relies  on  stored  knowledge  of  objects’  chromatic  characteristics,  false  color  images  may  be 
disruptive  of  perceptual  performance  (Sinai  et  al.,  1999b),  making  colored  sensor  fusion 
unhelpful  in  object  recognition. 

The  reason  for  using  color  in  fused  imagery  is  based  on  the  assumption  that  color 
(natural  or  artificial)  facilitates  image  segmentation  (Walls,  1942),  providing  contour 
information  about  the  individual  objects  present  in  that  scene  as  a  way  to  achieve  target 
detection. The role of color in object recognition has also not
been clearly established. Past research in this field has been inconclusive
because the studies used differing experimental methodologies, presenting a variety of tasks
and types of stimuli to participants. Observers were required to
recognize  or  identify  natural  or  manufactured  objects  presented  as  colored  or  achromatic 
photographs,  line  drawings,  artificially  colored  photographs,  etc.,  using  noise  or  blur  as 
image  degrading  factors  as  a  way  to  simulate  poor  resolution  conditions  (Wurm,  Legge, 
Isenberg  &  Luebker,  1993;  Ostergaard  and  Davidoff,  1985;  Biederman  and  Ju,  1988; 
Joseph  and  Proffitt,  1996). 

In  order  to  measure  the  effectiveness  of  sensor  fusion  devices  in  enhancing  the 
night  capabilities  of  military  operators  over  currently  employed  systems,  detailed 
exploration in the area of human factors is required.

This thesis focuses on the human factors of sensor fusion; more specifically,
human perception of natural and artificial color images similar to those produced by sensor
fusion processes. The objective of this thesis is to quantitatively assess the role of natural
and artificial color in object recognition when shape information is degraded, investigating
whether and how false color might be useful in multi-band fused imagery. Digital
photographs of natural objects (fruits and vegetables) were presented as natural and false
color images, together with their gray scale counterparts, degraded by different levels of
noise to emulate imagery generated by multispectral devices, and were compared in a
standard naming task. Two precise measures of visual ability that are critical to the
military, reaction time (RT) and accuracy in target detection, were measured.

Natural color might facilitate object recognition in either or both of two ways: by
facilitating scene segmentation (Walls, 1942) and by allowing access to stored color
knowledge (Joseph and Proffitt, 1996). With false colored images,
recognition may be disrupted because access to stored knowledge is denied, and
participants will have to rely on color contrast alone to recognize objects through
scene segmentation. It is hypothesized that shorter RTs and greater accuracy rates will
occur with the natural color images across all levels of noise, and that the difference in RTs
between natural color and false color images will be largest in the conditions with the
greatest amount of noise. Faster RTs are expected with the natural color images
because participants will use color information to access stored knowledge of the object’s
chromatic features and will also be able to accomplish scene segmentation. Larger effects
of natural color are also expected in the conditions with higher levels of noise
because, with the objects’ shape information degraded, subjects may be expected
to rely more heavily on color information to recognize the stimuli. The longest RTs and
greatest error rates are expected with the gray scale images, because participants will
be able to use color neither for scene segmentation nor for access to stored knowledge during the
object recognition task. False color images are expected to yield intermediate results,
because the presence of color at least supports scene segmentation. If
color is used only for scene segmentation, similar performance with natural color and false color
images is expected, and both should be faster and more accurate than performance
with the gray scale images.
II. BACKGROUND


A. HISTORY

The Vietnam War era witnessed the great expansion of the infrared industry. This
industrial development was motivated by the inability of the U.S. forces to prevent North
Vietnamese forces from conducting night operations (Schwarzkopf, 1992).

Since the post-Vietnam era, all high-value military platforms have carried NVDs.
These systems, which use specific sensors and techniques to acquire and engage
opposing forces during low visibility or nighttime periods under adverse warfare
environments, have proven effective in all kinds of combat operations (NVESD,
1997). However, unanticipated problems have arisen while utilizing these devices.
Unaided human perception of the surroundings at night is vastly different from the view
through NVDs (Vargo, 1999). The user’s lack of understanding of the night environment and
its impact on NVD performance has caused the capabilities of these devices to be
exceeded, resulting in numerous mishaps (Salvendy, 1997). Also, the increasing
sophistication of military tactics and weapon systems requires enhanced capabilities that
current NVDs cannot provide (Krebs et al., 1998). Multiband image fusion
devices, currently under development, are intended to overcome several of the existing
limitations of infrared systems and to accomplish the tasks required by modern nighttime
warfare. In these new devices, the information provided by each of the sensors in the
system is combined into a single fused image before being displayed to an end user. The
resulting image is a multispectral false-colored rendering of the imaged scene. The
expected advantage of fused images lies not only in retaining the most helpful features of each
of the fused sensors, but also in obtaining additional information based on the difference
between the sensors.

This study will investigate one of the unsolved problems of these new NVDs: the
use  of  color  in  the  resulting  fused  imagery.  A  generic  presentation  of  how  the  human 
visual  system  (HVS)  accomplishes  the  perception  of  color  will  provide  a  basic 
understanding  of  the  problems  related  to  adding  color  to  multisensor  fused  systems.  A 
general  description  and  characteristics  of  single-band  sensors  currently  in  use  are  provided 
to  aid  in  the  comprehension  of  image  fusion  techniques  and  future  multisensor  devices. 
Previous  research  involving  the  role  of  color  in  object  recognition  is  summarized,  along 
with  several  studies  that  investigate  and  develop  different  techniques  of  color  fusion. 

B.  PERCEPTION  OF  COLOR  IN  THE  HUMAN  VISUAL  SYSTEM 

1.  Electromagnetic  (EM)  Spectrum 

The  first  characteristic  of  the  night  environment  relevant  to  an  understanding  of 
night  vision  technology  is  the  EM  spectrum  of  the  night  sky  and  its  relationship  to  the  eye 
and to the NVDs. NVDs allow us to exploit a greater portion of the EM spectrum
than the human eye can. This can be seen in Fig. 1.

NVGs, FLIR, and most other night imaging devices, like the human eye, are each
sensitive to different wavelengths of the EM spectrum. These frequency bands are
similar  in  nature  and  their  relationship  can  be  clearly  expressed  by  their  position  in  the 
EM  spectrum,  shown  in  Fig.  2. 

As can be seen, the optical band covered by visible light is a relatively small
portion  of  the  entire  spectrum.  Visible  and  near  IR  light  are  considered  to  be  reflected 
energy,  while  the  thermal  bands  in  the  mid  and  far  infrared  are  primarily  radiated  energy. 

2.  Human  Visual  System 

The human visual system (HVS) is sensitive to radiation whose wavelength is in
the 0.4 to 0.7 micron range of the EM spectrum. When a combination of these wavelengths
reaches the human eye, neural processing of the resulting signals gives rise to the psychological
experience called color vision. Visible radiation received by the HVS may come directly
from a light source, but is usually reflected by object surfaces before reaching our eyes.

Three  primary  perceptual  dimensions  of  these  radiations  combine  to  define  our 
psychological  perception  of  color:  hue  (wavelength  of  the  radiation),  saturation  (hue 
purity)  and  lightness  (intensity  of  the  light  source). 

Hue is the reaction to wavelengths ranging from 0.4 microns (violet) to 0.7
microns (red). As Newton demonstrated, white light really consists of a combination of
different colored lights. Each wavelength in this range, after being processed by
the HVS, produces the perception of a specific color, as shown in Fig. 3.

Figure 1: Spectral response of eye, image intensifiers and IR sensors (MAWTS-1,

Figure 2: Visible Color Spectrum (Matlin & Foley, 1997)

One  way  to  organize  colors,  proposed  by  Newton  in  1704,  is  in  terms  of  a  color 
wheel.  The  outside  of  the  wheel  represents  monochromatic  colors  (those  that  can  be 
produced  by  a  single  wavelength)  plus  non-spectral  hues  (those  that  cannot  be  described 
in  terms  of  a  single  wavelength  from  the  visual  spectrum).  Similar  hues  are  located  near 
one  another. 

In  addition  to  hue,  our  experience  of  color  is  characterized  by  lightness  and 
saturation.  Objects  vary  in  the  amount  of  light  they  reflect  from  their  surfaces.  Lightness 
is the apparent reflectance of a color. It describes our psychological reaction to the
physical characteristic of reflectance. Objects vary in lightness from very dark (black) to very
light (white), with other shades of lightness in between (Matlin & Foley, 1997).

Another characteristic of color is saturation, our psychological reaction to the
physical characteristic of purity. Saturation reflects the amount of white light added to a
hue. A saturated hue, lying on the perimeter of the color wheel with no white light added, is
perceived as a deep hue. An unsaturated hue lies closer to the center of the wheel and
is perceived as a much lighter hue. Completely unsaturated colors are called achromatic
or neutral, and they are perceived as white, shades of gray, or black, depending on their
lightness. These colors are represented in the center of the color wheel, as
shown in Fig. 4.
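The relationship between added white light and saturation can be illustrated with a minimal sketch. It assumes the HSV color model from Python's standard `colorsys` module as a stand-in for the color wheel; the helper names `add_white` and `hue_and_saturation` are hypothetical and do not come from the cited sources.

```python
import colorsys

WHITE = (1.0, 1.0, 1.0)

def add_white(rgb, amount):
    """Blend an RGB light toward white: amount=0 keeps the color, amount=1 gives white."""
    return tuple((1.0 - amount) * c + amount * w for c, w in zip(rgb, WHITE))

def hue_and_saturation(rgb):
    """Return (hue, saturation) in the HSV model; hue is a fraction of a full turn."""
    h, s, _value = colorsys.rgb_to_hsv(*rgb)
    return h, s

red = (1.0, 0.0, 0.0)        # saturated hue on the perimeter of the wheel
pink = add_white(red, 0.5)   # same hue with white light added
gray = (0.5, 0.5, 0.5)       # achromatic color at the center of the wheel

for name, rgb in [("red", red), ("pink", pink), ("gray", gray)]:
    h, s = hue_and_saturation(rgb)
    print(f"{name}: hue={h:.2f} saturation={s:.2f}")
```

In this model, adding white leaves the hue unchanged and lowers the saturation, and the achromatic gray has zero saturation, matching its position at the center of the color wheel.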

The mixture of monochromatic hues produces the perception of the whole
diversity of colors in the human visual spectrum. Hues can be mixed in two different
ways. Subtractive mixing involves a single light source that passes through filters or falls on
pigments. Parts of the spectrum are absorbed, or subtracted, as represented in Fig. 5.
Additive mixing is accomplished by combining colored lights of different
wavelengths, as represented in Fig. 6. The result of color mixing can be predicted by
using the color wheel. A graphic explanation of this method is shown in Fig. 7.

Figure 3: Wavelength composition of sunlight and artificial light.
(Matlin & Foley, 1997).

Color television is an example of additive mixture. The screen consists of many
tiny dots that glow blue, green or red when irradiated. All other colors are produced
when the lights from neighboring dots combine in our photoreceptors,
provided the screen is viewed from an appropriate distance. A yellow patch is really composed of red
and green dots (see Figure 4).

Hue  wavelengths  are  not  evenly  arranged  around  the  periphery  of  the  wheel.  This 
distribution  is  necessary  to  place  complementary  hues  on  exactly  opposite  sides  of  the 
color  wheel.  Complementary  hues  are  those  whose  additive  mixtures  make  an 
achromatic  color  (shade  of  gray). 
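The two kinds of mixing described above can be sketched numerically. The sketch below is only illustrative: it assumes a coarse three-band (red, green, blue) description of light, and the paint reflectance values are made-up numbers chosen to mimic the blue and yellow paints of Fig. 5, not measured data.

```python
def additive_mix(*lights):
    """Add light intensities band by band, clipping at full intensity."""
    return tuple(min(1.0, sum(band)) for band in zip(*lights))

def subtractive_mix(*reflectances):
    """Multiply band reflectances: each pigment absorbs part of what the others reflect."""
    result = (1.0, 1.0, 1.0)
    for refl in reflectances:
        result = tuple(r * k for r, k in zip(result, refl))
    return result

def is_achromatic(rgb, tol=1e-9):
    """A color is achromatic (white, gray or black) when all bands are equal."""
    return max(rgb) - min(rgb) <= tol

# Colored lights, as (red, green, blue) intensities.
RED, GREEN, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

# Television example: red and green dots combine into yellow.
yellow_light = additive_mix(RED, GREEN)

# Complementary hues: blue light plus yellow light gives an achromatic result.
blue_plus_yellow = additive_mix(BLUE, yellow_light)

# Assumed paint reflectances: blue paint also reflects some green,
# yellow paint reflects red and green.
BLUE_PAINT = (0.0, 0.5, 1.0)
YELLOW_PAINT = (1.0, 1.0, 0.0)
paint_mix = subtractive_mix(BLUE_PAINT, YELLOW_PAINT)

print("red + green light:", yellow_light)
print("blue + yellow light achromatic?", is_achromatic(blue_plus_yellow))
print("blue + yellow paint:", paint_mix)
```

Under these assumptions the same pair of hues gives different results depending on the kind of mixing: blue and yellow lights add up to an achromatic color, while blue and yellow paints leave only the green band.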

By means of either of these two techniques, colored light reaches the human visual
system (HVS), producing the perception of color. The way in which colored light
produces the perception of color in the HVS is explained by two theories, each of which
applies to a different level of the visual processing system. Trichromatic theory explains
the way in which the input signal from the photoreceptors is combined (Neitz, Neitz &
Jacobs, 1993). Opponent process theory explains how the information provided by the
photoreceptors is interpreted by the neural system (De Valois & De Valois, 1975).
Figure 5: Subtractive mixtures for blue paint and yellow paint.
[Figure label: mixture of blue and yellow absorbs yellow, orange, red, blue, and violet]
(Matlin & Foley, 1997)

Figure 6: Additive mixtures for blue light and yellow light (Matlin & Foley, 1997)
The Young-Helmholtz Trichromatic theory assumes that humans have three kinds
of color receptors, each differentially sensitive to light from a different part of the visual
spectrum. These receptors are called “cones,” and they work best in well-lit
environments, giving rise to the full range of colors (achromatic and chromatic). The
human retina also contains another kind of receptor, the “rods,” which work best in
poorly lit environments, where they give rise to the perception of achromatic colors.

Visual  perception  research  has  established  that  the  three  kinds  of  cone  pigments 
have  different  but  overlapping  absorption  curves  (De  Valois  and  De  Valois,  1975),  each 
of them being maximally sensitive to a different wavelength, as shown in Fig. 8.

We  will  refer  to  these  three  kinds  of  cones  as  S  (short  wavelength),  M  (medium 
wavelength)  and  L  (long  wavelength)  based  on  the  wavelengths  to  which  they  are  most 
sensitive.  In  this  way,  human  visual  receptors  are  able  to  distinguish  the  wavelength  of 
an  incoming  signal,  because  it  will  activate  one  or  several  receptors  in  a  unique  pattern  or 
distribution  for  each  wavelength. 
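The idea that each wavelength produces its own pattern of activity across the three cone types can be sketched as follows. The bell-shaped sensitivity curves, the peak wavelengths and the bandwidth used here are illustrative assumptions, not the measured absorption curves of Fig. 8; wavelengths are given in nanometers (0.4 to 0.7 microns corresponds to 400 to 700 nm).

```python
import math

# Assumed peak sensitivities (nm) and a common bandwidth; illustrative values only.
CONE_PEAKS_NM = {"S": 420.0, "M": 530.0, "L": 560.0}
BANDWIDTH_NM = 40.0

def cone_responses(wavelength_nm):
    """Response of each cone type, modeled as a Gaussian around its peak."""
    return {
        cone: math.exp(-0.5 * ((wavelength_nm - peak) / BANDWIDTH_NM) ** 2)
        for cone, peak in CONE_PEAKS_NM.items()
    }

def response_pattern(wavelength_nm):
    """Relative activation across cones, normalized so the pattern sums to 1."""
    responses = cone_responses(wavelength_nm)
    total = sum(responses.values())
    return {cone: value / total for cone, value in responses.items()}

def strongest_cone(wavelength_nm):
    """Name of the cone type that responds most strongly to this wavelength."""
    pattern = response_pattern(wavelength_nm)
    return max(pattern, key=pattern.get)

for wavelength in (450.0, 540.0, 620.0):
    pattern = response_pattern(wavelength)
    summary = ", ".join(f"{cone}={share:.2f}" for cone, share in pattern.items())
    print(f"{wavelength:.0f} nm -> {summary}")
```

Because the curves overlap, most wavelengths activate more than one cone type, and it is the relative pattern, rather than any single response, that distinguishes one wavelength from another in this sketch.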

Trichromatic  theory  by  itself  cannot  explain  all  the  color  phenomena.  Some 
mechanism  beyond  the  receptor  level  must  combine  the  information  from  the  cones  in  a 
complex  way.  Several  pieces  of  evidence  point  to  the  existence  of  separate  mechanisms 
for  red,  yellow,  green  and  blue.  How  do  these  four  mechanisms  arise  from  only  three 
cone  systems?  Human  sense  of  color  must  arise  from  additional  processing  of  the  input 
from  the  three-cone  system. 

Figure 7: Predicting additive mixtures. (Matlin & Foley, 1997)

Figure 8: Absorption curves for the three cone pigments. (Matlin & Foley, 1997)

Opponent-process theory (De Valois & De Valois, 1975) covers this second level
of the visual processing system, beyond the photoreceptors. This process is implemented
by means of the neural connections among photoreceptors and neurons in the human retina.
De Valois & De Valois (1975) modeled all these possible connections, showing how both
chromatic and achromatic information could be conveyed through identical mechanisms
and illustrating how four color channels could arise from three cone systems (Matlin
& Foley, 1997). This theory, whose development is far beyond the scope of this thesis,
underlies several fusion algorithms (Waxman, Fay, Gove, Seibert,
Racamato, Carrick & Savoye, 1995).
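As a rough illustration of how four color mechanisms could arise from three cone systems, the sketch below combines L, M and S cone signals into one achromatic and two chromatic opponent channels. The weights and the function names `opponent_channels` and `dominant_hues` are assumptions chosen for clarity; they are not the connection model of De Valois & De Valois (1975).

```python
def opponent_channels(l_cone, m_cone, s_cone):
    """Combine three cone signals into one achromatic and two chromatic channels.

    The weights are illustrative assumptions, not a published model.
    """
    luminance = l_cone + m_cone                    # achromatic (brightness) channel
    red_green = l_cone - m_cone                    # > 0 signals red, < 0 signals green
    blue_yellow = s_cone - (l_cone + m_cone) / 2   # > 0 signals blue, < 0 signals yellow
    return luminance, red_green, blue_yellow


def dominant_hues(l_cone, m_cone, s_cone):
    """Name the hue signalled by each chromatic channel (hypothetical helper)."""
    _, red_green, blue_yellow = opponent_channels(l_cone, m_cone, s_cone)
    hues = []
    if red_green != 0:
        hues.append("red" if red_green > 0 else "green")
    if blue_yellow != 0:
        hues.append("blue" if blue_yellow > 0 else "yellow")
    return hues
```

Each chromatic channel carries two opposed hues in its sign, so two channels built from three cone inputs are enough to signal red, green, blue and yellow.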

Another  characteristic  of  color  is  color  constancy.  Because  of  color  constancy, 
humans  tend  to  see  the  hue  of  an  object  as  staying  the  same  despite  changes  in  the 
wavelength  of  the  light  illuminating  the  object.  Variations  in  illumination  light  may  arise 
by  changes  in  the  intensity  or  in  the  composition  of  the  illumination  source.  Absolute 
color  constancy  would  be  obtained  if  an  object  appeared  to  be  the  same  color  regardless 
of  the  type  of  illumination  or  the  colors  of  nearby  objects  (Maloney,  1993).  Our 
perception  of  color,  then,  is  not  dependent  on  the  absolute  wavelengths  reaching  our 
retinas, but on reflectance relationships among objects in our field of vision (Brou,
Sciascia, Linden & Lettvin, 1986). Color constancy is probably not maintained
completely, so human color perceptions are influenced to a degree by the nature of the
illumination. This lack of consistency for the intensity of reflected light required the HVS
to  develop  a  variety  of  mechanisms  to  disentangle  the  contradictions  of  varying 
illumination  and  thereby  to  achieve  nearly  constant  color  perception  based  on  distal 
surface  reflectivity  (Matlin  &  Foley,  1997).  Based  on  this  characteristic  of  the  HVS,  color 
constancy seldom breaks down to the extent that an observer would assign two different
color names to the same object just because of changes in illumination (Jameson &
Hurvich,  1989).  When  an  object  is  seen  under  different  illumination  conditions,  it  might 
look  subtly  different,  but  it  will  still  be  recognized  as  the  same  color.  A  major  limitation 
to  sensor  fusion  systems  is  that  these  mechanisms  cannot  be  duplicated  to  achieve  the 
same  constant  color  perception. 


C.  CURRENT  NIGHT  VISION  DEVICES 

Night vision devices (NVDs) enable the user to exploit the night environment by
processing EM bands outside the human visual spectrum. These devices do
not allow perfect vision during nighttime operations, but they do enable humans to
improve their performance in multiple tasks such as movement on foot or even night
attacks using sophisticated weapon systems, whether land-based or airborne.

Current military night operations are enabled through imaging in the visible-near
infrared band (wavelengths of 0.57 to 0.9 microns) and in the thermal infrared band
(wavelengths of 7 to 14 microns). Fig. 1 shows the portions of the EM spectrum covered
by NVDs. Both types of NVDs are explained in more detail in the next two subsections.

1. Image Intensifiers


Image intensifiers (I²) process the visible and near-infrared spectrum and, much
like the human visual system, depend almost entirely on reflected energy from scene
illumination (MAWTS-1, 1995). They amplify reflected moonlight and starlight (primarily
yellow through near infrared light, with wavelengths of 0.57 to 0.9 microns) and ambient
light produced by artificial sources of illumination (visible light wavelength of 0.4 to 0.7
microns).

Visible and near-infrared imagery is currently provided by the third generation of
I² tubes. The five major components of an I² tube are the objective lens, the photocathode,
the microchannel plate, the phosphor screen and the eyepiece lens. Radiant or reflected
optical energy received at this device is focused, turned into electric energy, amplified and
turned again into green-yellow light in the 0.56 micron range, matching the peak
sensitivity of photopic human vision. Finally it is inverted and focused before reaching the
operator's eye. Image intensified imagery is usually displayed in night vision goggles
(NVGs).

The ratio of the brightness of the image at the output of the eyepiece lens over the
luminance of the light entering the objective lens is called the gain of the I² tube. The
variants of the Gen III NVGs currently used have a gain of 25,000, a substantial
advantage over the unaided human eye in the night environment (MAWTS-1, 1995).

Illumination, expressed in lumens per square meter (lm/m²) or lux, measures the
amount of visual energy that exists in a specific location. Lunar illumination is the primary
energy source for natural illumination in the night sky (MAWTS-1, 1995). Additionally,
stellar phenomena and starlight provide a certain amount of illumination. Figure 9 shows
how moonless night sky illumination almost matches the peak sensitivity of NVGs.

Two other contributors of illumination are the sun and artificial sources. The
setting sun at zero degrees below the horizon is too bright for NVG operations; however,
approximately one half hour after sunset, when the sun has lowered to seven degrees
below the horizon, it may provide usable illumination until it has set past twelve degrees.
Artificial lighting such as street lights or radio tower warning lights can also provide
significant illumination, but large concentrations of artificial illuminators can wash out the
NVG image.


Figure 9: Moonless night spectral composition (MAWTS-1, 1995)

The atmosphere is the most important environmental factor controlling the
performance of the NVGs. The atmosphere can attenuate the light, reducing the level of
energy reaching the NVGs. This attenuation occurs mainly by absorption or scattering,
since attenuation by refraction is almost negligible. NVGs operate by
intensifying light energy between 625 and 960 nanometers. Any attenuation, either before
or after the light strikes the terrain, will effectively reduce the usable light available to the NVG
and thus affect the resulting image. Attenuation is caused by the impact of light particles
with particles larger than one micron in length such as water vapor, dust, snow, and other
natural or man-made obscurants. The effect of these particles will depend very much on
their size and density, but all of them will affect distance estimation and depth perception,
significantly reducing the usefulness of these devices and even making them almost
useless during adverse atmospheric conditions (MAWTS-1, 1995).

2.  Thermal  Infrared  Devices 

The thermal infrared devices, supported by several kinds of forward-looking
infrared (FLIR) imaging devices, convert invisible thermal energy from the far infrared
spectrum into a visible image. FLIRs generally process emissions from two infrared
bands, midwave (3 to 5 microns) and long wave (8 to 12 microns). Infrared energy
(thermal energy) within these bandwidths is emitted by all objects with a temperature
above absolute zero (-273 degrees Celsius).

Natural thermal energy is produced when objects that have previously absorbed
thermal energy from IR sources, such as the sun or warm air currents, start radiating this
energy. Man-made objects are another source of thermal energy, for example the heat
radiated as a result of friction between moving parts in mechanical devices (MAWTS-1,
1995). It is important to note that most man-made objects emit in the 8 to 12 micron
band, hence the military interest in LWIR sensors (Sampson, 1996).

In order to measure the thermal energy radiated by an object we define emissivity
(E) as the ratio of an object’s ability to emit thermal energy at a certain temperature over
that of a blackbody at the same temperature. A “blackbody” is defined as a perfect
absorber of thermal energy and therefore also a perfect emitter, with an efficiency of unity
(MAWTS-1, 1995). Other factors impacting emissivity are material composition, ambient
temperature and the object’s temperature and geometry. Most natural objects have a high
emissivity and therefore a majority of their thermal signature is from self-emission.
Conversely, objects with low emissivity have a correspondingly high reflectivity and
therefore reflect the thermal energy of their surroundings.
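The emissivity ratio defined above can be made concrete with a small numerical sketch. It assumes two standard physics relations that the text does not state: the Stefan-Boltzmann law for the total power radiated per unit area by a blackbody, and the relation reflectivity = 1 - emissivity for an opaque object. The function names are hypothetical, and emissivity is treated as a single wavelength-independent number, which is a simplification.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W / (m^2 K^4)


def blackbody_exitance(temp_kelvin):
    """Power per unit area radiated by a perfect emitter (Stefan-Boltzmann law)."""
    return STEFAN_BOLTZMANN * temp_kelvin ** 4


def emissivity(object_exitance, temp_kelvin):
    """Ratio of an object's emission to that of a blackbody at the same temperature."""
    return object_exitance / blackbody_exitance(temp_kelvin)


def reflectivity_opaque(emissivity_value):
    """Assumed relation for an opaque object: what is not emitted is reflected."""
    return 1.0 - emissivity_value
```

Under these assumptions, an object radiating half as much as a blackbody at the same temperature has an emissivity of 0.5 and reflects half of the thermal energy arriving from its surroundings.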

Thermal energy emitted by an object, whether it is internally generated or reflected
from another source, determines its thermal signature. It is primarily the difference among
thermal signatures of objects that defines the thermal scene (Sampson, 1996). An
important measure of performance of a FLIR is “delta T” or the temperature difference between
an object and its background (MAWTS-1, 1995). The cyclic heating and cooling of the
terrain causes the diurnal cycle of temperature differences between objects of different
thermal mass and inertia.

Figure 10 shows the diurnal cycle of temperature differences for an armored
vehicle and the background terrain. From the graph one can visualize the negative
thermal contrast (object cooler than background) of the armored vehicle on a clear sunny
day and the positive thermal contrast (object warmer than background) of the armored
vehicle at night. Because of the positive thermal contrast, FLIRs will be able to detect
targets against the background during night periods.
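The sign convention for thermal contrast described above can be written as a short sketch. The function names are hypothetical, and the temperatures used with them below are invented for illustration; they are not read from Figure 10.

```python
def delta_t(object_temp_c, background_temp_c):
    """Temperature difference between an object and its background, in degrees Celsius."""
    return object_temp_c - background_temp_c


def thermal_contrast(object_temp_c, background_temp_c):
    """Classify the contrast: positive if the object is warmer, negative if cooler."""
    difference = delta_t(object_temp_c, background_temp_c)
    if difference > 0:
        return "positive"
    if difference < 0:
        return "negative"
    return "none"
```

For example, a vehicle at 18 degrees against terrain at 12 degrees gives a positive contrast, while the same vehicle at 25 degrees against sunlit terrain at 35 degrees gives a negative one.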


Figure 10: Diurnal cycle example (MAWTS-1, 1995)


Attenuation of thermal energy after it leaves its source can occur by absorption or
scattering. Atmospheric vapor or humidity is the most significant absorber of thermal
energy. In very hot and humid climates, the high amount of absorption may literally
render the FLIR useless (MAWTS-1, 1995). Molecular scattering occurs when thermal
energy particles strike other particles present in the atmosphere, such as nitrogen, oxygen,
water vapor and carbon dioxide. Because of these strikes, thermal energy can be scattered
in different directions, making it difficult to reach the IR sensor (MAWTS-1, 1995).

FLIR systems are complex and their detailed composition is far beyond the scope
of this thesis. The basic elements of this device are the infrared sensor, the signal
processor and the display unit. The detector array is composed of semiconductive
material, which turns 8 to 12 micron thermal energy into an analog electrical output to the
signal processor. The signal processor provides the special signal functions required to
stabilize and enhance the analog output from the detector array. The signal from the
processor is transformed into an image through the use of a cathode ray tube (CRT) and
displayed on a “heads down display” (HDD), cockpit “heads up display” (HUD) or
“helmet mounted display” (HMD) (MAWTS-1, 1995). Current FLIR technology is
centered on the first generation (Gen I) FLIR thermal imaging device. The U.S. Army
began integration of second generation FLIRs into new and existing weapon systems to
maximize U.S. forces’ advantage on the battlefield (NVESD, 1997). IR imagery is
displayed using a variety of FLIR imaging devices (both
scanners and IR focal plane arrays) on monochrome phosphor monitors, the
cockpit heads-up display, or combiner optics.

D. MULTISPECTRAL IMAGE FUSION DEVICES


The two sensing modalities currently used for night vision purposes (I² and IR)
have been improved due to increases in sensor detection ranges and display resolution, but
they still have their own limitations. I² devices need reflected light for detection and IR
devices must be able to detect thermal contrasts among the objects in the scene (Vargo,
1999). Recent advances in signal processing have made it possible to combine the
best attributes of the emissive radiation sensed by the thermal sensor and the reflected
radiation sensed by the image intensifier sensor into a single “fused” image (Steele and
Perconti, 1997).

Long wave IR and I² sensors are good candidates for image fusion. The thermal
contrast between relatively high emissivity objects and the background is a good indicator
of potential targets, obstacles and waypoints. The inability to see details in areas that have
relatively poor thermal contrast, caused by low emissivity differences, might be greatly
compensated for by fusing with an I² sensor. Given the proper illumination conditions, the
“visible” contrast can provide very useful cues that are independent of thermal conditions.
I² sensors might also aid in producing a natural representation of the scene due to the
proximity of this waveband to the visible (0.4 to 0.7 microns) waveband (Steele
and Perconti, 1997).

Recent technological advances in the production of multi-spectral sensors now
allow I² and IR imagery to be mapped to a high speed processor where it can be fused
and displayed to an end user (McDaniel et al., 1998). Some advantages of combining
multiple spectral imagery into a single display might be: (a) reduced cost, space,
computational processing and weight requirements from combined resources, (b) reduced
operator workload by limiting the need to alternate between the two sensors, and (c) improved
object search, detection and recognition.

Numerous fusion techniques have been developed in recent years that
produce both monochromatic and color imagery (Toet, van Ruyven & Valeton, 1989;
Scribner, Satyshur & Kruer, 1993; Palmer, Ryan, Tinkler & Creswick, 1993; Toet &
Walraven, 1996; Therrien et al., 1997; Waxman, Gove, Fay, Racamato, Carrick, Seibert &
Savoye, 1997). These techniques may differ in algorithmic approach, but they all
have the same objective: improving the image quality for the observer (Krebs, Scribner,
McCarley, Ogawa & Sinai, 1999b).

A typical color fusion process transforms the dual monochrome bands generated at
the I² and IR sensors onto display variables such as the red-green-blue (RGB) channels
(based on human trichromatic vision theory) and opponent color processing (Waxman et
al., 1997). This approach takes advantage of the observer’s color vision system to
introduce additional dimensionality for interpretation through color contrast. The use of
color in image fusion has frequently been advocated under the argument that color contrast can
provide improved detection performance when added to luminance contrast (Peli et al.,
1999).
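A minimal sketch of mapping two monochrome bands onto RGB display channels is given below. The channel assignment (IR to red, I² to green, and a crude opponent-style difference to blue) is an assumption made purely for illustration; it is not the algorithm of Waxman et al. (1997) or of any other fusion method cited here. The function name `fuse_false_color` is hypothetical, and each band is represented as a list of rows of 0 to 255 pixel values.

```python
def fuse_false_color(i2_band, ir_band):
    """Map two spatially registered monochrome bands to a grid of (R, G, B) pixels.

    Assumed mapping (illustrative only):
      red   = IR value
      green = I2 value
      blue  = I2 minus IR, clipped at zero (a crude opponent-style difference)
    """
    same_rows = len(i2_band) == len(ir_band)
    if not same_rows or any(len(a) != len(b) for a, b in zip(i2_band, ir_band)):
        raise ValueError("bands must be spatially registered (same shape)")
    fused = []
    for i2_row, ir_row in zip(i2_band, ir_band):
        fused.append([(ir, i2, max(i2 - ir, 0)) for i2, ir in zip(i2_row, ir_row)])
    return fused
```

With this mapping, a pixel that is bright only in the I² band is rendered green-blue and a pixel that is bright only in the IR band is rendered red, so the two sources of contrast remain distinguishable by hue in the fused image.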

If the final result of a fusion process is presented in a monochromatic format, the
full capability of the HVS is not being used. Objects viewed by low-light and
infrared sensors will generally have the same spatial characteristics, but they will have
completely different contrast levels. By displaying these variations as color differences, as
a result of sensor fusion techniques, target-background contrast should be improved and
the dynamic range of the scene should be increased (Krebs et al., 1998).

There is a very important distinction between a color scene observed by the
unaided eye through the HVS and a processed color fused image. The color mapping
process is affected by the specific information provided by the two sensors. Since neither
sensor is in the visible waveband, the color algorithms used may not produce imagery that
matches the colors seen by the HVS (Steele and Perconti, 1997).

It is difficult to predict how beneficial multisensor color images are going to
be for human performance. Some experimental evidence indicates that object recognition
depends on stored knowledge of objects’ chromatic characteristics (Joseph &
Proffitt, 1996). Incongruency of false color images may therefore be disruptive of
perceptual performance, and could even produce worse performance compared to
single-band imagery alone (Sinai et al., 1999a), although overall evidence is equivocal as to what
role color plays in object recognition.

Another major limitation to sensor fusion systems is that HVS mechanisms cannot
be duplicated to achieve color constancy, as previously stated in the HVS section.
Therefore, varying illumination conditions will produce different chromatic
representations of the same object. As stated above, past psychophysical research
has been inconclusive in determining what role color plays in the human visual system and,
therefore, in determining what utility, if any, color sensor fusion has for human
performance. A review of the literature might clarify the current situation in this
field, which has intrigued vision researchers for many years.

E.  REVIEW  OF  THE  LITERATURE 

It is likely that color vision evolved in response to changes in human behavior
during the process of adaptation to the natural environment. In Polyak’s view, color
vision evolved to facilitate food gathering, involving search and recognition of natural
objects (Polyak, 1957). Color might facilitate these tasks by means of scene segmentation.
Walls suggested that color promotes the perception of contour (Walls, 1942). Color
differences, like luminance differences, can be used to segment images into regions
containing information about individual objects and provide more reliable information
about object shape, because shadows also produce luminance contours (De Valois &
Switkes, 1983).

Color may also serve another perceptual function apart from scene segmentation:
object recognition. Although virtually no object can be recognized on the basis of its
color alone, the colors of some objects are less arbitrary than others; therefore, objects
with higher color diagnosticity could be recognized faster (Biederman & Ju,
1988).

Object recognition can be achieved by means of two different processes, either
separately or in combination. During a bottom-up process, color is used to define the
contour or shape of the objects present in a scene. Once the shape of the object is defined,
human memory recognizes the object based on that contour. This is normally the case for
objects with low color diagnosticity. On the other hand, during a top-down process, color
is used to access memory in a direct way, and allows subjects to distinguish an object
among others of similar shape, just based on its color. This is the case for objects with high
color diagnosticity.

Assuming  that  scene  segmentation  is  a  bottom-up  process,  it  should  not  depend  on 
our  knowledge  of  the  colors  of  things.  Diagnosticity,  on  the  other  hand,  relies  on 
memory.  It  might  improve  object  recognition  by  restricting  the  set  of  possible  alternatives 
(Wurm  et  al.,  1993). 

There  is  disagreement,  however,  as  to  whether  or  not  color  is  actually  used  to 
facilitate  object  recognition  (Wurm  et  al.,  1993).  This  disagreement  can  be  attributed  to 
several  causes  including  differences  in  psychophysical  tasks  employed,  differences  in 
luminance  characteristics  across  the  color  conditions,  the  use  of  different  levels  of  shape 
degradation,  and  differences  in  types  of  objects  used  as  stimuli  and  color  formats 
employed. 

Three tasks have typically been used to determine the role of color in object
recognition. These tasks are classification, verification and naming. In a classification
task, participants are shown pictures or words that refer to a specific predesignated
category (Price & Humphreys, 1989). In a verification task, a target name is presented to
the participants and they must answer whether a subsequently presented stimulus matches
the target or not (Ostergaard & Davidoff, 1985; Biederman & Ju, 1988; Joseph & Proffitt,
1996). During naming tasks, participants must verbally identify the object shown in each
stimulus (Ostergaard & Davidoff, 1985; Biederman & Ju, 1988; Price & Humphreys,
1989; Wurm et al., 1993).

Discrepancies in the results of previous research studies were analyzed by their
own authors. Price & Humphreys (1989) stated that surface information may affect object
naming more strongly than classification, and that these effects on naming may be most
pronounced for objects that require the most differentiation, because extra time is then
required to differentiate any given object from its structurally related competitors. Joseph
& Proffitt (1996) argued that their results differed from those of other studies for many of
the same reasons that Price & Humphreys’s results differed. Because the objects they
used in their verification tasks generally came from structurally similar categories (animals,
fruits, vegetables, and flowers) that were also natural categories, it was not surprising to
find an effect of surface color in their verification task.

Other aspects that may give rise to discrepant results in color research will be
considered in more detail in this section in the course of the literature review.

Markoff (1972) measured reaction times (RTs) for subjects to decide which of
three targets (tank, jeep or soldier) was present in a black-and-white or color slide. The
targets were hidden in real-world backgrounds. He blurred the slides to evaluate the
interaction of spatial resolution and color. He found that RTs were shorter (and error
rates lower) for the color slides, and the advantage of color over black-and-white
performance increased with greater blur. These results indicate that color is helpful in a
search task and that color may be more helpful when shape information is degraded.

Ostergaard and Davidoff (1985) investigated the role of color in the recognition
and naming of everyday objects. Their study was based on the idea that color is unhelpful
in shape processing, due to the considerable evidence for the separate processing of color
and shape shown by anatomical and physiological research. Observations in monkeys
indicate that the primate visual system consists of several separate and independent
subdivisions that analyze different aspects of the same retinal image, such as color, depth,
movement, and orientation. Human perceptual experiments are remarkably consistent
with these predictions (Livingstone & Hubel, 1988). Therefore, they tried to find out at
which stage of the object recognition process color and shape information are combined,
given that we are aware at the end of this process if objects are incongruently colored
(Perlmutter, 1980). This means that color is not a part of the pictorial coding of objects,
i.e., it is not necessary for the psychological description of the object, but
rather it is stored as part of a set of attributes such as depth, movement or orientation
(Seymour, 1979). They hypothesized that any benefit from having color vision should be
obtained from a later stage than identification (Ostergaard and Davidoff, 1985).

The first experiment considered the role of color in object naming, because if color
affects the processing of objects at any stage from early registration to the availability of
the object name, this should be reflected in the object naming latencies. Twenty-four
common fruits and vegetables were photographed on black-and-white and color film.
Each participant was shown the complete series of pictures and was required to name the
objects depicted in those pictures. Colored pictures produced significantly faster response
latencies than black-and-white pictures. Therefore, the authors concluded that color
information is beneficial in object naming, although it was not clear whether those results
occurred because color simply provides a separate cue for discriminating stimulus items
as part of a bottom-up process.

In the second experiment they tried to solve this problem by only using items of
identical color and thereby removing the possibility of using color to discriminate between
alternatives. They compared object naming directly to object verification with three types
of stimuli: items depicted in their natural color (always red), achromatic versions, and
items depicted in an inappropriate color (always blue). This final condition was included
to determine whether an inappropriate color would be interfering or merely have the same
effect as no color. In the naming task, participants were required to name the depicted
objects as quickly as possible without making errors. In the verification task participants
were required to respond positively when the target item was presented. Before each
block of trials, the participants were informed what color the stimuli would be and they
were shown the three alternative items. The color effect reached significance in the naming
task, but it failed to reach significance in the object verification task. The naming
advantage found for red pictures of objects could not be attributed to the stimulus
characteristics of the colors used, so they concluded that it should be due to the
meaningful conjunction of color and shape. They also concluded that there was no
detrimental effect of wrong color compared to achromatic input (Ostergaard and
Davidoff, 1985).

The third experiment was run to verify the generality of the results of the previous
one. In Experiment 2, the stimuli were blocked according to color type. In this
experiment, correctly colored, inappropriately colored and achromatic stimuli were
presented at random. Again, the participants performed naming and verification tasks.
Positive responses during the verification task were significantly faster than negative
responses. All other factors and interactions failed to reach significance. Item color was
found to be significant in the naming task. Paired comparisons revealed that correctly
colored pictures were named significantly faster than either achromatic or inappropriately
colored versions. There was no significant difference between achromatic and
inappropriately colored versions. Thus, the major result of Experiment 2 was confirmed.

These results show that color facilitates object naming but not object recognition.
They found faster object naming for color than for black-and-white pictures, and they believed
this could be explained by assuming that objects are listed in semantic representation as a
collection of physical attributes. One of these attributes is color, and they postulated that
this attribute could be accessed directly by the physical color input. Another important
conclusion of this study was that although correct color facilitated object
naming, inappropriate color did not cause significant inhibition in either Experiment 2 or 3.

Biederman and Ju (1988) investigated the role of color in object recognition,
comparing the latency at which objects could be named or verified when they were shown
either as line drawings or color photographs. The empirical issue of this study was to
determine if the presence of surface attributes of an object, such as color, facilitates the
psychological representation of that object, over what can be simply derived by depiction
of the object’s edges. Color diagnosticity among objects was also investigated, to
determine whether color and brightness were providing a contribution to recognition
independent of the main effect of photos versus drawings. For some kinds of objects, color
is diagnostic of the object’s identity. For other kinds, normally man-made objects, color is
not diagnostic. If color was contributing to object recognition, then
the former kinds of objects should benefit more than the non-diagnostic objects from
their depiction as color photographs rather than as line drawings.

It  was  expected  that  participants’  reaction  times  and  error  rates  for  the  naming 
task  would  be  smaller  for  color  stimuli.  In  the  verification  task,  each  presentation  was 
shown at one of three exposure times, followed by a mask. The exposure times
were 50, 65 and 100 milliseconds. In this task participants could anticipate the
surface characteristics of almost all of the targets and, for the diagnostic objects, the color
as well. Therefore, if participants were using color to access an object’s mental
representation, then objects photographed in color or those that were diagnostic should be
recognized  faster  relative  to  the  naming  tasks.  It  was  also  assumed  that  longer  exposure 
duration  and  slightly  dimmer  projector  intensity  would  favor  colored  photography 
(Biederman  and  Ju,  1988). 

The results of the experiments did not match the authors’ assumptions. Over the
five experiments, mean RTs and error rates for naming or verifying line drawings were
virtually identical to those for color slides. Even objects with diagnostic colors did not
enjoy any advantage when presented as color slides during the verification task. The authors
found a mean advantage for the line drawings and for the objects
with no diagnostic color. The conclusion of these studies was that a simple line drawing
could be identified about as quickly and as accurately as a colored photographic image of
that same object. Color diagnosticity did not facilitate object recognition. These results
support  the  premise  that  the  access  to  a  mental  representation  of  an  object  can  be 
accomplished  with  an  edge-based  representation  of  a  few  simple  components.  Color  plays 
only  a  secondary  role  in  recognition  when  edges  can  be  readily  extracted  (Biederman  and 
Ju,  1988). 

In contrast to Biederman’s results, some authors speculated that surface
characteristics should facilitate recognition. Following this line, Price and Humphreys
(1989)  examined  the  effects  of  color  congruency  and  photographic  detail  on  the  naming 
and  classification  of  objects  from  structurally  similar  and  structurally  dissimilar  categories. 
Color  congruency  was  also  assessed  by  contrasting  performance  with  correctly  colored 
line  drawings  of  objects,  black-and-white  outline  drawings  and  line  drawings  assigned  very 
incongruent  colors. 

To  clarify  all  these  effects,  the  authors  conducted  a  series  of  experiments  in  which 
participants  performed  naming,  subordinate  (only  with  stimuli  from  structurally  similar 
categories)  and  superordinate  classification  tasks.  Color,  when  present,  was  part  of  the 
surface  description  of  objects  in  Experiments  1  and  2.  The  influence  of  color  in 
participants’  performance  as  part  of  the  object’s  surface  description  was  examined  in 
Experiment  3  by  testing  the  effects  of  colored  backgrounds  on  object  naming  (Price  and 
Humphreys,  1989). 

Price and Humphreys hypothesized that object color and surface details would be
beneficial for discriminating between category members, because these objects require
greater differentiation to separate the target object from competitors of the same category
during a naming task. Their findings supported this hypothesis and revealed that the
influence of surface color in object recognition is not reserved for naming tasks alone.
Classification of objects can also benefit from the use of congruent surface color when
shape  information  is  not  sufficient  for  discriminating  among  category  members.  The 
implication  of  Price  and  Humphreys’s  findings  is  that  effects  of  surface  color  are  not 
necessarily  reserved  for  the  later,  name-retrieval  stages  of  processing  (Joseph  and  Proffitt, 
1996). 

Joseph  and  Proffitt  (1996)  conducted  a  series  of  experiments  to  determine  the 
influence  of  color  as  a  surface  feature,  i.e.,  its  role  during  a  bottom-up  process,  versus  its 
influence  accessing  stored  knowledge  during  a  top-down  process,  in  order  to  achieve 
object recognition. They defined stored color knowledge as semantic information about
the  prototypical  colors  of  objects,  such  as  the  knowledge  that  apples  are  typically  red. 

They  considered  that  the  role  of  stored  knowledge  of  color  in  object  recognition 
had  not  been  examined  deeply  enough  in  previous  studies.  They  also  considered  that  the 
findings  in  the  literature,  concerning  the  role  of  color  in  object  recognition,  have  yielded 
mixed  results.  They  argued  that  for  surface  color  to  be  a  useful  cue  for  recognition,  the 
participant  must  decide  if  the  surface  color  is  appropriate  for  an  object.  Therefore,  they 
would  have  to  access  an  object’s  semantic  description  for  this  check  process  to  occur  and 
compare  it  with  the  surface  color  present  in  the  image.  This  study  investigated  whether 
the  decision  that  the  stimulus  matches  the  target  or  not,  depends  more  on  the  activation  of 
stored knowledge or on the processing of surface color.

Joseph and Proffitt conducted three experiments to investigate the roles of surface
color and stored color knowledge in object recognition. Pictures of natural objects were
used as stimuli because most natural objects have prototypical colors, as opposed to
man-made objects, in which color is far more arbitrary. In this way they could measure
the influence of color when participants were presented with natural objects showing
completely different colors from those stored in their memories. According to the results
of  the  first  experiment,  congruent  surface  color  made  recognition  easier  than  did 
incongruent  color,  with  a  verification  task.  Congruently  or  incongruently  colored  line 
drawings  of  common  natural  objects  were  presented  briefly,  masked,  then  followed  by  a 
label. An object is considered congruently colored when it shows its most
prototypical color, that is, the one that the vast majority of people have stored in memory
as related to that specific object. These results appeared to conflict with Biederman and
Ju’s (1988) findings, but the authors argued that the reason for this discrepancy might be
the different natures of the stimulus sets. Thus, use of surface color as a cue for
recognition is more beneficial for objects from natural categories (Joseph and Proffitt,
1996). 

Results from the second and third experiments confirmed their hypothesis. They
concluded that the processing of shape information was more influential than any source
of color information in object recognition, but when response interference could not be
attributed to shape information, i.e., when both stimulus and target had similar shape,
stored color knowledge was an overriding factor relative to surface color. They also
found that the activation of stored color knowledge did not depend on the presence of
surface color, because even the identification of uncolored pictures was affected by stored
color  knowledge.  They  examined  the  effect  of  stored  color  knowledge  by  observing  RTs 
and  error  interference  when  semantic  associations  of  color  were  present  or  absent.  For 
example,  an  uncolored  picture  of  an  apple  might  have  been  followed  by  the  label  cherry  or 
by  the  label  blueberry.  More  interference  should  occur  with  the  label  cherry  because 
apples  and  cherries  share  a  prototypical  color.  Apples  and  blueberries  are  different  in 
prototypical  color  and,  therefore,  interference  should  be  less  (Joseph  and  Proffitt,  1996). 

Wurm, Legge, Isenberg, and Luebker (1993) investigated the role of color in
object recognition, asking whether color facilitates recognition in images with low
spatial resolution. Previous studies (Markoff, 1972; Ostergaard and
Davidoff, 1985; Biederman and Ju, 1988) have concluded that color improves object recognition more
when spatial resolution is low (blur or noise) or when shape information is less specific
(fruits and vegetables vs. man-made objects). The major purpose of this research was to
examine the hypothesis that color and shape information interact in object recognition,
that is, that color facilitates object recognition more when spatial resolution is low.

In their two main experiments, participants were presented with full-color and gray-scale
images of twenty-one different food items and vegetables. They chose to use food
objects  because  they  have  a  wide  range  of  colors  and  shapes,  so  they  were  representative 
of  natural  objects.  These  objects  may  provide  a  favorable  domain  for  revealing  a  role  of 
color,  given  that  color  vision  probably  evolved  in  response  to  functional  interaction  with 
natural objects (Polyak, 1957).

The authors considered luminance as a factor that could have led to disagreement
among the results of earlier studies examining the role of color in object recognition.
Luminance characteristics varied across the color conditions in several previous studies.
In the Markoff (1972) and Ostergaard and Davidoff (1985) studies, the distributions of
luminance  were  not  matched  in  the  color  and  black-and-white  slides.  Biederman  and  Ju 
(1988)  compared  line  drawings  with  color  photographs.  Visual  analysis  of  the  color 
photographs  may  have  been  disadvantageous,  because  of  greater  difficulty  in  edge 
extraction compared with line drawings (Wurm et al., 1993). To avoid this problem,
Wurm  and  colleagues  employed  only  gray-scale  images  matched  pixel  by  pixel  in 
luminance  with  the  color  images. 

Wurm, Legge, Isenberg and Luebker (1993) were also interested in examining whether
color and shape information interact in object recognition, such that color facilitates
recognition more when spatial resolution is low. Psychophysical and computational
studies show that chromatic contrast sensitivity is confined to a lower spatial frequency
range than luminance contrast sensitivity (Kelly, 1983; Mullen, 1985; Derrico &
Buchsbaum, 1991). These studies support the authors’ hypothesis of an interaction
between color and blur, assuming that chromatic contrast (color differences) can facilitate
recognition when high-frequency information is removed by a shape-degrading factor such as
blur.

In one experiment of Wurm and his colleagues’ study, both full-color and gray
scale images were presented at two resolutions, blurred and unblurred, during a naming
task. RTs were shortest in the full-color unblurred condition and longest in the gray scale
blurred condition. They concluded that color does improve recognition of food objects
whether measured as accuracy or RT, but they did not find the hypothesized interaction
between color and spatial resolution. Two additional experiments were conducted to
examine the origins of this effect. They investigated whether shape prototypicality or color
diagnosticity facilitated object recognition. They found that participants were faster at
recognizing images judged to be highly prototypical (where the object is shown from its
most common point of view), but that less prototypical images benefit more from color,
that is, show a greater reduction in RT. These findings are consistent with Biederman and
Ju’s (1988) view that primary access to object recognition uses a structural (geometrical)
representation of objects and this representation is in part generated by the presence of
color. The results of Experiment 5 suggested that participants’ explicit knowledge about
food color (diagnosticity) does not account for the advantage of color in real-time object
recognition.

The authors questioned how color and shape could act additively and non-interactively
in object recognition. They argued that perhaps color contributes to an early
stage of contour extraction and scene segmentation (De Valois and Switkes, 1983; Walls,
1942). That role is likely to rely on low spatial frequencies and hence be relatively
insensitive to blur. Thus, they concluded that although color does improve object
recognition, the mechanism is probably sensory rather than cognitive in origin.
Otherwise, it would be related to people’s knowledge of the colors of things, but this
would not match the results of their color-diagnosticity experiment (Wurm et al.,
1993).

Several of these experiments concluded that shape is the basic element in object
recognition (Ostergaard and Davidoff, 1985; Biederman and Ju, 1988; Wurm et al., 1993;
Joseph and Proffitt, 1996) and that color plays a secondary role, facilitating object naming
as a final step in object identification (Ostergaard and Davidoff, 1985; Wurm et al., 1993).
But when object shape is degraded, color may play a more important role, facilitating
object recognition by scene segmentation (Biederman and Ju, 1988).

The disagreement among researchers about the role of color in
object recognition has given rise to a similar question about the use of color in
multisensor fusion, in which artificially colored images are supposed to improve human
performance in target detection and situational awareness. Part of the discrepancy among
these latter studies may stem from the different fusion algorithms used or from the
wide variety of psychophysical tasks employed to measure behavior (Sinai et al., 1999b),
as can be seen in the experiments summarized below.

Waxman, Gove, Seibert, et al. (1996) conducted an experiment to evaluate
human perception during a visual search task. The detection of small low-contrast
targets embedded in natural night scenes was measured in terms of reaction time and
accuracy. Visible, infrared, color fused, and two forms of fused gray scale images were
shown to the participants, whose task was to determine whether the hidden target was on
the right half or the left half of the screen. Although the report of this study does not
show any statistics supporting the results, RTs during the detection task were fastest when
color fused imagery was used, across various levels of target contrast.

Toet, Ijspeert, Waxman and Aguilar (1997) investigated whether the increased amount of
detail  in  the  fused  images  can  yield  an  improved  observer  performance  in  a  task  that 
requires  situational  awareness.  Fused  images  were  obtained  from  low  visible  and  thermal 
signals,  using  two  different  fusion  methodologies.  The  stimuli  presented  to  the 
participants  were  in  six  different  chromatic  formats:  Fused  color  images  generated  by  two 
fusion  algorithms,  gray  level  images  representing  the  luminance  component  of  the  fused 
color  images,  and  gray  level  images  representing  the  signals  of  the  low-visible  and  the 
infrared  cameras.  The  task  required  the  detection  and  localization  of  a  person  in  the 
displayed  scene,  relative  to  some  characteristic  details  that  provide  the  spatial  context. 

Visual  and  thermal  contrasts  were  low,  since  stimuli  were  collected  just  before  and 
after  sunrise.  Visual  contrast  was  low  due  to  low  luminance  of  the  sky.  Thermal  contrast 
was  also  low  due  to  the  similar  temperature  of  the  objects  in  the  scene.  The  authors 
hypothesized  that  the  fusion  of  images  registered  in  these  conditions  would  result  in 
images  that  represented  both  the  context  (background)  and  the  details  with  a  large  thermal 
contrast  (like  people)  in  a  single  composite  image.  The  results  showed  that  participants 
could  indeed  determine  the  location  of  a  person  in  a  scene  with  a  significantly  higher 
accuracy  when  they  performed  with  fused  images,  compared  to  the  other  chromatic 
formats.  The  two  color  fusion  algorithms  yielded  the  best  overall  performance,  producing 
error  rates  of  1.5%  and  1.9%,  while  the  corresponding  gray  scale  fused  images, 
respectively,  produced  error  rates  of  4.5%  and  4.9%.  The  error  rate  for  the  thermal 
images was 8%, and for the visual images was 20%. The authors concluded that color
contrast in fused imagery does help in target detection.

The objective of the study conducted by Steele and Perconti (1997) was to
determine whether color fusion processes were of benefit to helicopter pilots in the
performance of night helicopter flights. Specifically, the authors investigated whether
adding synthetic color to night vision multisensor (visible near infrared and long wave
infrared) fused imagery aided pilots in interpreting spatial relationships and improved
situational awareness. The study consisted of a part task simulation with three task
groups: object recognition and identification, horizon perception, and geometric
perspective tasks. Object recognition and identification tasks required
the participant to determine whether a specific object was present, to locate a specific object
and determine its position in the field of vision, or to provide detailed information about an
object. Horizon perception tasks required the participant to determine
whether or not the perceived horizon was level. Geometric perspective tasks required the
participant to identify the shape or orientation of an object using monocular depth
perception cues.

Images were presented in five different chromatic formats: monochrome, FLIR
monochrome, a gray scale fusion algorithm and two different color fusion algorithms.
Results differed across the three general types of visual tasks used,
although in general fusion-based formats resulted in better participant performance. The
authors concluded that the benefits of adding synthetic color to fused imagery
depend on the color algorithm used, the visual task performed, and scene content. In
the object recognition task, both the FLIR and the gray scale fusion formats resulted in
significantly faster RTs. In the horizon perception tasks no significant differences were
found in response times or accuracy. In geometric perspective tasks the gray scale
fusion algorithm produced significantly faster RTs than the FLIR alone. There were no
significant differences among the other formats. The two color fusion algorithms
examined in this study represent two very different approaches. Therefore, it is not
surprising to find these two algorithms at opposite ends of some of the data plots of this
study (Steele and Perconti, 1997).

The purpose of the study conducted by Krebs et al. (1998) was to modify the
existing F/A-18 targeting FLIR system by adding a dual-band color sensor to improve
target contrast and standoff ranges. The authors argued that objects viewed by low-light
and infrared sensors would have dramatically different contrast levels between the two
systems. Therefore, displaying these variations as color differences should improve target-background
contrast and increase the dynamic range of the scene. When searching for a
target, color should help by giving better context to the scene, as a result of the higher
contrast among objects in the scene (scene segmentation), thus allowing for more efficient
pilot orientation and target detection.

This experiment used eight nighttime video sequences collected from an early
prototype fusion sensor system developed by Texas Instruments and the Night Vision and
Electronic Sensors Directorate (NVESD). Each of the sequences was presented in five
different image formats: low light visible imagery, infrared imagery, gray scale fused
imagery and two different color fused imageries. It was hypothesized that these images
should be maximally optimized for target discrimination. A standard visual search task
was used to assess whether pilots’ situational awareness was improved by using sensor-fused
imagery. Participants responded faster to the infrared target than to one of the
color fused targets, while the other color fused target showed no significant difference.
These results generally agree with Steele and Perconti’s (1997) study, which used the same
videotaped sequences. The authors concluded that color fusion did not improve pilots’
situational awareness. Pilots reported that the color fused scene appeared unnatural due
to the choice of colors. However, pilots did report that color fused objects were easier to
discriminate than IR or I² objects, because the color contrast facilitates
discrimination from the background noise. Therefore, color fusion may be more
appropriate for targeting applications than for navigation and pilotage applications.

The  study  conducted  by  Sinai,  McCarley,  Krebs  and  Essock  (1999b)  compared 
performance  on  two  different  tasks,  an  object  recognition  task  and  a  situational  awareness 
task.  The  authors  hypothesized  that  performance  on  these  two  very  different  tasks  would 
be  differently  affected  both  by  the  single  sensor  imagery  and  by  the  fused  imagery.  They 
hypothesized  that  performance  on  the  detection/recognition  task  should  be  better  for  the 
IR imagery, because IR images usually have higher contrast than the I² image. Likewise,
they hypothesized that performance should be slightly better for the I² imagery compared
to the IR imagery for the situational awareness task, because IR imagery has lower
resolution than the I² imagery. The authors also argued that the fused imagery would
result  in  performance  at  least  as  good  as  the  better  of  the  two  single  band  sensors. 

Stimuli were images collected using a long-wave IR sensor and an I² low-light sensor.
Six image formats were tested: single band IR and low-light formats, two color-fused
formats and two achromatic fused formats, with each of the fused formats using IR
imagery of white-hot polarity or black-hot polarity respectively. Two experiments were
conducted.  The  first  required  participants  to  detect  a  target  (person,  vehicle  or  neither) 
against  naturalistic  backgrounds.  The  second  measured  participants’  situational  awareness 
by  asking  them  to  decide  whether  the  scene  was  inverted  or  not. 

Significant  differences  were  found  between  the  white-hot  color  fused  error  rates 
and  the  white-hot  gray  scale  fused  error  rates  for  both  tasks.  Thus,  the  false-color  of  the 
fusion  algorithm  improved  performance  for  this  format  and  for  both  tasks.  The  only 
difference  between  the  two  formats  was  the  addition  of  color.  In  the  other  fused  format 
tested, however, color not only failed to improve performance but actually hindered
it by increasing error rates in both tasks. The results of this study provided strong
evidence for the benefits that color-fused imagery can produce in human performance, but
also  demonstrated  how  drastically  results  may  vary  according  to  tasks  or  algorithms  used 
in  the  research  (Sinai  et  al.,  1999b). 

In sum, several of these experiments concluded that color fusion facilitates target
detection (Waxman et al., 1996) and situational awareness (Toet et al., 1997; Sinai et al.,
1999b), one concluded that only targeting applications, and not situational awareness, may
benefit from a color fused scene (Krebs et al., 1998), while Steele and Perconti (1997)
argued that the benefits of color fusion depend on the color algorithm used, the visual task
performed and scene content.

Summarizing the plausible benefits of colored imagery, there
is some evidence that color may play an important role in object recognition when shape
is degraded, both by means of scene segmentation in a bottom-up process and by
accessing stored knowledge in a top-down process, provided the colored imagery is congruent. For
these same reasons, the use of color in fused imagery, where most of the time the
contours of the objects will not be sharply defined, also seems useful for object
recognition, at least through a bottom-up process in which color contrast facilitates
contour definition. There remains the possibility that color incongruency may
produce disruptive effects, as shown in the reviewed literature. There therefore seems
to be enough support to move forward with research on the role of natural color
and false color in object recognition.


F.  HYPOTHESIS 

In an effort to continue this line of research, avoiding some of the deficiencies
detected in the past and drawing together several of the techniques used in previous
studies, this thesis will conduct a human performance experiment measuring reaction
times and error rates during a standard object naming task, to examine how natural
and artificial color facilitate object recognition when objects’ shape information is
degraded. Naming was chosen as the psychophysical task for this experiment for two
reasons. First, it provides a way to check the accurate and positive identification of the
object presented in each stimulus to the participant. Second, if color affects the
processing of objects at any stage from early registration to the availability of the object
name, this should be reflected in the object naming latencies (Ostergaard and Davidoff,
1985).

Digital photographs of natural objects (fruits and vegetables) were used. Common
food objects also provide a favorable domain for studying the interaction of color and
shape due to their natural although limited variation of both attributes within a category
(Wurm et al., 1993). The familiarity and non-arbitrary colors of these objects might
encourage participants to use color for recognition, perhaps especially in those cases when
shape is not so helpful due to its similarity among several stimuli.

The  effect  of  natural  and  artificial  color  was  examined  by  comparing  natural  and 
false  color  imagery  with  their  gray  scale  counterparts  as  a  control  for  luminance.  Since 
colored images and their gray scale equivalents were matched in luminance, any advantage
measured in the colored imagery should have originated from the presence of color.
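
The logic of the luminance control can be illustrated with a short sketch. It is not the procedure used to build the stimuli, which is not specified at this level of detail; the Rec. 601 luminance weights, the 8-bit RGB pixel representation and the function names are assumptions introduced only for illustration. The sketch builds a gray scale counterpart whose per-pixel luminance equals that of the colored image up to rounding, so that any remaining difference between the two images is chromatic.

```python
# Illustrative sketch only. The luminance weights below (Rec. 601) are an
# assumption; the source does not state which weighting was used.
R_W, G_W, B_W = 0.299, 0.587, 0.114


def luminance(pixel):
    """Weighted luminance of one (r, g, b) pixel with 8-bit channels."""
    r, g, b = pixel
    return R_W * r + G_W * g + B_W * b


def to_matched_gray(image):
    """Return a gray scale copy of `image` (a list of rows of (r, g, b) tuples).

    Each output pixel has r == g == b, with the common value set to the
    rounded luminance of the corresponding colored pixel.
    """
    gray = []
    for row in image:
        gray_row = []
        for pixel in row:
            y = round(luminance(pixel))
            gray_row.append((y, y, y))
        gray.append(gray_row)
    return gray
```

Because the three weights sum to one, the luminance of each gray pixel differs from that of its colored counterpart by at most the rounding error of half a gray level.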

Gaussian  monochromatic  noise  was  used  as  an  image-degrading  factor.  The  aim 
of  using  noise  was  to  achieve  some  type  of  image  degradation  in  order  to  examine  how 
recognition  might  be  affected  under  the  degraded  viewing  conditions  that  occur  with  night 
vision  devices. 
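
A minimal sketch of this kind of degradation follows. It assumes that "monochromatic" means a single Gaussian sample is drawn per pixel and added equally to the red, green and blue channels, so that the noise perturbs luminance without introducing color differences of its own; that reading, the 8-bit clipping and the function name are assumptions, not details taken from the experiment.

```python
import random


def add_monochromatic_gaussian_noise(image, sigma, seed=None):
    """Add zero-mean Gaussian noise to `image` (rows of (r, g, b) tuples).

    One noise sample is drawn per pixel and shared by all three channels
    (an assumed interpretation of "monochromatic"). Results are rounded
    and clipped to the 8-bit range 0..255.
    """
    rng = random.Random(seed)  # seeded generator for reproducible stimuli
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            n = rng.gauss(0.0, sigma)
            noisy_row.append(
                tuple(min(255, max(0, round(c + n))) for c in pixel)
            )
        noisy.append(noisy_row)
    return noisy
```

Calling the function with increasing `sigma` values (for example 0, 20 and 40, which are placeholders rather than the levels used here) would yield a graded series of degraded images, and applying the same seed to a colored image and its gray scale counterpart would give both the same noise pattern.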

It was hypothesized that if stored color knowledge affects object recognition,
shorter RTs and smaller error rates would occur within the natural color images across all
levels of noise, and that the difference in RTs and error rates between natural color and
false color images would be largest in the conditions with the greatest amount of noise.
Faster RTs and smaller error rates were expected within the natural color images because
participants would be able to use color information to access stored knowledge of the
objects’ chromatic features. Larger effects of natural color images were also expected in
the conditions with higher levels of noise because, with the objects’ shape
information degraded, subjects might be encouraged to rely more heavily on color
information to recognize the stimuli. The longest RTs and greatest error rates were
expected within the gray scale images, because participants would be able neither to
accomplish scene segmentation nor to access stored knowledge during the object
recognition task. Intermediate results were expected for false color images, because
participants, although unable to access stored knowledge of color, would at least
be able to achieve scene segmentation and speed up recognition, compared to gray
scale images. The four stated hypotheses are summarized below:

•  Shorter  RTs  and  smaller  error  rates  were  expected  within  natural  color  stimuli 
across  all  levels  of  noise. 

•  Differences in RTs and error rates between natural color and false color stimuli
were expected to be largest at the greatest levels of noise.

•  Longest  RTs  and  greatest  error  rates  were  expected  within  the  grayscale  images. 

•  Intermediate  results  were  expected  for  false  color  images. 

These  will  be  the  Alternative  Hypotheses  for  the  statistical  tests.  The  Null 
Hypotheses  will  be  that  there  are  no  differences. 

III.  EXPERIMENT


A.  METHODS 

1.  Participants 

Thirteen  students  (eleven  male  and  two  female)  from  various  military  services  and 
job  specialties,  undergoing  academic  studies  at  the  Naval  Postgraduate  School,  and 
ranging  in  age  from  28  to  38,  voluntarily  participated.  Participants  who  volunteered  for 
this  study  might  not  represent  a  broad  spectrum  of  the  population  but  their 
psychophysical characteristics were very similar to those of potential NVD operators.

All  participants  were  screened  for  normal  color  vision  with  the  Pseudo-isochromatic 
Plates  and  had  at  least  20/20  corrected  vision.  Participants  were  naive  to  the  purpose  of 
the  experiment.  All  participants  were  native  English  speakers.  All  participants  granted 
informed  consent  prior  to  participation. 

2.  Apparatus 

The  experimental  workstation  consisted  of  a  200  MHz  Pentium  personal 
computer  equipped  with  a  Texas  Instrument  TMS-340  Video  Board  and  the 
corresponding  TIGA  Interface  to  Vision  Research  Graphics  software.  The  stimuli  were 
displayed on an MF-8521 High Resolution color monitor (21” X 20” viewable area) equipped with an
anti-reflect,  non-glare,  P-22  short  persistence  CRT.  Pixel  size  was  .26”  horizontal  and 
.28”  vertical.  Resolution  was  800  X  600  square  pixels  and  the  frame  rate  was  98.7  Hz. 
Luminance  of  the  monitor  was  linearized  by  means  of  an  eight-bit  color  look-up  table 
(LUT)  for  the  red,  blue  and  green  guns.  Moderate  ambient  luminance  was  maintained 
during the test. Viewing distance was approximately 100 cm and the participants were
free  to  move  their  heads. 

3.  Stimuli 

Experimental  stimuli  were  digital  images  of  twenty-three  fruits  and  vegetables 
whose  names  appear  in  Table  1.  Images  were  photographed  by  the  experimenter  with  a 
Kodak digital camera, Model DC50. Objects were photographed in the early afternoon
under  natural  daylight  against  a  background  of  white  paper.  Each  object  was 
photographed  from  four  different  viewpoints,  all  of  them  judged  to  be  canonical  by  the 
experimenter.  Viewing  distance  was  varied  in  order  to  make  all  objects  occupy 
approximately  the  same  area  within  the  photograph.  From  the  initial  set  of  twenty-three 
objects  photographed,  the  nineteen  objects  judged  by  the  experimenter  to  be  the  most 
common  and  easy  to  name  were  selected  for  use  in  the  experimental  trials.  The  four 
remaining  objects  were  reserved  for  use  in  the  practice  trials. 

STIMULUS OBJECTS

APPLE          AVOCADO(*)     BANANA         BEANS
BROCCOLI       CABBAGE        CARROT         COCONUT
CORN           CUCUMBER       GARLIC(*)      GRAPES
LEMON          MUSHROOM       ONION          ORANGE(*)
PEAS           PEAR           PEPPER         POTATO
RADISH         TOMATO         ZUCCHINI(*)

(*) Object used for practice trials only.

Table 1: Object names.

Images were then manipulated using commercially available image processing
software (Adobe Photoshop 4.0). Images were first cropped to a rectangle of 600 X 500
pixels, subtending 11.4° X 10.2° of visual angle from a viewing distance of 100 cm,
and  then  rendered  in  each  of  four  different  color  formats:  natural  hue  chromatic,  natural 
hue  achromatic,  false  hue  chromatic  and  false  hue  achromatic.  False  hue  images  were 
obtained  by  replacing  the  color  of  each  pixel  in  a  natural  hue  image  with  its 
complementary  color.  Complementary  colors  are  a  pair  of  colors  which  when  mixed 
additively,  appear  as  white.  Along  a  color  wheel,  complements  are  any  two  colors 
separated  by  180  degrees,  that  is,  any  two  colors  at  opposite  ends  of  a  single  diameter.  For 
consistency,  all  images  of  natural  hue  were  reassigned  a  value  of +180  degrees  in  order  to 
get  their  false  color  counterparts.  The  value  of  +180  degrees  was  chosen  arbitrarily  and  is 
of no theoretical significance. Natural hue achromatic and false hue achromatic images
were obtained by converting the chromatic natural hue and false hue images, respectively, to gray scale. Gray scale images
matched  their  chromatic  counterparts  in  pixel-by-pixel  luminance.  The  purpose  of  these 
gray  scale  images  was  to  provide  a  control  for  any  changes  in  luminance  that  accompanied 
manipulations  of  color  within  chromatic  images. 
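
The hue manipulation described above can be illustrated with a minimal sketch. It assumes that hue is represented as an angle in the HSV color model and uses the `colorsys` module of the Python standard library; the function name `rotate_hue` is hypothetical, and the hue adjustment in Adobe Photoshop 4.0 may be computed differently.

```python
import colorsys

def rotate_hue(rgb, degrees=180.0):
    """Return an 8-bit RGB triple with its hue shifted by `degrees`.

    Assumption: hue is an angle in HSV space; saturation and value are kept.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + degrees / 360.0) % 1.0  # colorsys stores hue in [0, 1)
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))

red = (255, 0, 0)
cyan = rotate_hue(red)              # a +180 degree shift maps red to its complement
gray = rotate_hue((128, 128, 128))  # zero-saturation pixels are unchanged
```

A rotation of this kind leaves HSV value unchanged but generally alters luminance, which is consistent with the use of gray scale images as a luminance control.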

Once  the  four  sets  of  stimuli  were  obtained,  degraded  images  were  produced  by 
addition of achromatic Gaussian noise to undegraded images. Noise was added by
increasing or decreasing the intensity of each pixel within an image by a value drawn
pseudo-randomly from a Gaussian distribution with a mean of 0. The standard deviation
(SD)  of  this  distribution  determined  the  level  of  degradation.  Three  levels  of  noise  were 
used.  At  the  first  level  images  were  not  degraded  (SD  of  noise  distribution  =  0).  At  the 
second  and  third  levels,  images  were  degraded  with  values  drawn  from  Gaussian 
distributions  with  SDs  of  50  and  100  units  respectively.  Color  formats  and  degradation 
levels  were  crossed  factorially. 
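
A minimal sketch of the noise procedure follows. It assumes 8-bit RGB pixels stored as tuples and the same noise offset applied to all three channels of a pixel, which is one reading of "achromatic" noise. The function name `add_achromatic_noise` and the use of Python's `random.Random` generator are illustrative assumptions, not the software used in the study. Out-of-range values are clipped to 0 and 255.

```python
import random

def add_achromatic_noise(pixels, sd, seed=None):
    """Add zero-mean Gaussian noise with standard deviation `sd` to each pixel."""
    rng = random.Random(seed)
    noisy = []
    for r, g, b in pixels:
        offset = rng.gauss(0.0, sd)  # one draw per pixel, shared by all channels
        # Clip each channel to the valid 8-bit range after adding the offset.
        noisy.append(tuple(min(255, max(0, round(c + offset))) for c in (r, g, b)))
    return noisy

NOISE_SDS = (0, 50, 100)  # the three degradation levels
image = [(200, 120, 40), (10, 10, 10), (250, 250, 250)]  # hypothetical pixels
degraded = {sd: add_achromatic_noise(image, sd, seed=1) for sd in NOISE_SDS}
```

With a standard deviation of 0 the image is returned unchanged, which corresponds to the undegraded level.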

Mean luminance of each image in the experimental set of stimuli was calculated.
The mean value and standard deviation of the luminance values for each color format and
level of noise were computed. These results are shown in Table 2. As shown in Table
2, average luminance was almost constant across color formats at each level of noise,
although values of individual images changed more drastically. Pixel values of less than
zero or greater than 255 were set to values of zero and 255, respectively, when noise was
added. Therefore, mean pixel values decreased slightly as noise increased, tending toward
a value of 128 (50 cd/m²). The mean luminance for each color format was almost
constant too (natural hue chromatic = 58.15 cd/m², false hue chromatic = 57.92
cd/m², natural hue monochromatic = 58.08 cd/m², false hue monochromatic = 58.12
cd/m²). Therefore, differences in RTs between color images and their gray scale
counterparts cannot be attributed to differences in luminance.


                            COLOR FORMAT

NOISE               Natural Hue   False Hue    Natural Hue      False Hue
LEVEL               Chromatic     Chromatic    Monochromatic    Monochromatic

0        Mean       58.84         58.61        58.85            58.83
         SD          9.31          8.63         9.20             8.43

1        Mean       58.69         58.45        58.60            58.63
         SD          8.88          8.82         8.81             8.07

2        Mean       56.89         56.70        56.79            56.89
         SD          7.08          6.56         7.13             6.35

Table 2: Mean values and standard deviations (SD) of luminance for each color
format and level of noise in cd/m².


4.  Procedure 

Each  of  nineteen  objects  was  presented  at  random  in  twelve  different  formats,  and 
from  two  different  points  of  view  selected  at  random  from  among  the  four  available 
views of each object. A total of 456 (19 X 12 X 2) stimuli were presented to each
participant  as  experimental  trials,  and  fifteen  stimuli  were  presented  for  practice 
purposes. 

Each subject was thoroughly briefed on the background and procedures of the
experiment  and  was  given  the  opportunity  to  ask  questions.  Prior  to  testing,  participants 
read  a  list  of  the  nineteen  food  categories  in  the  experimental  trials  and  the  four  categories 
in  the  practice  trials.  They  were  told  that  their  task  was  to  name  as  rapidly  and  accurately 
as  possible  the  object  presented  in  each  stimulus.  They  were  also  told  that  the  objects 
would  fill  the  viewing  window  on  the  screen  (i.e.,  there  would  be  no  scale  information) 
and  that  items  could  appear  alone,  or  in  groups  of  items  of  the  same  type  (beans,  grapes, 
peas  and  radishes). 

Participants were tested one at a time. They were seated in front of the monitor at
a distance of 100 cm. The experimenter was seated in front of a different monitor in the
same  room,  from  which  he  could  not  see  the  stimulus  images  and  remained  unaware  of  the 
format  in  which  each  image  was  presented.  There  was  a  warning  tone  to  alert  the 
observer  that  the  trial  was  going  to  start  followed  by  a  pause  of  500  msec  before  the 
image  was  presented  on  the  screen.  Each  image  remained  on  the  screen  until  the  observer 
responded.  Upon  hearing  the  observer’s  response,  the  experimenter  immediately  pressed 
a  key  to  stop  the  timer  and  one  more  key  to  record  the  accuracy  of  the  response  (1  for 
“true”,  2  for  “false”),  based  on  the  correct  response  that  appeared  on  the  experimenter’s 
monitor.  No  feedback  was  provided  following  any  response.  The  subsequent  trial 
followed  after  an  intertrial  interval  (ITI)  of  approximately  1,000  msec.  A  uniform  gray 
patch  of  the  same  size  as  the  food  images  was  shown  on  the  screen  during  ITI’s.  The 
experimenter  allowed  participants  to  relax  for  a  short  period  of  time  after  each  group  of 
40  images.  Each  participant  completed  two  experimental  sessions,  with  each  session 
comprising a block of 228 trials. Within a block, participants observed each of the
nineteen  objects,  once  in  each  of  the  twelve  image  formats  (19  X  12).  The  point  of  view, 
from  which  each  object  was  viewed,  was  chosen  randomly.  The  entire  experiment  lasted 
approximately  50  minutes  (25  minutes  for  each  session). 
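
The block structure described above can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the object labels are placeholders, the helper `build_schedule` is hypothetical, and the two views of each object-format pair are assumed to be drawn without replacement from the four available views, one per block.

```python
import itertools
import random

OBJECTS = [f"object_{i:02d}" for i in range(19)]  # placeholder names
FORMATS = list(itertools.product(
    ("natural", "false"),         # hue
    ("chromatic", "achromatic"),  # color
    (0, 50, 100),                 # SD of the added noise
))

def build_schedule(seed=None):
    """Return two shuffled blocks of (object, format, view) trials."""
    rng = random.Random(seed)
    blocks = ([], [])
    for obj, fmt in itertools.product(OBJECTS, FORMATS):
        views = rng.sample(range(4), 2)  # assumption: two distinct views
        for block, view in zip(blocks, views):
            block.append((obj, fmt, view))
    for block in blocks:
        rng.shuffle(block)  # random presentation order within a block
    return blocks

session_1, session_2 = build_schedule(seed=0)
total_trials = len(session_1) + len(session_2)  # 19 X 12 X 2
```

Each block then contains every object once in each of the twelve formats, giving 228 trials per session.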

Each subject was given a block of 15 practice trials prior to the first session of the
experiment. All the participants completed both sessions on the same day. The practice trials
were  presented  in  the  same  format  as  the  experiment  stimuli.  Upon  completion, 
participants  were  offered  a  brief  pause  to  ask  any  question.  The  actual  experiment  then 
started.  Reaction  time  and  accuracy  were  recorded  for  each  trial. 

5.  Experimental  Design 

The experimental design for this study was a 2 X 2 X 3 X 2 within-subjects
factorial design. Factors included color (chromatic or achromatic), hue (natural or false),
level of noise (0, 50, 100), and block (Session 1 or 2). Sex was not considered as a factor
for this experiment. Each of the 12 cells contained 38 observations per participant (19
objects X 2 replications) for a total of 456 data points per participant. Reaction times for
incorrect  responses,  and  all  data  from  the  15  practice  trials  were  excluded  from  data 
analysis.  Previous  calculations  determined  that  the  number  of  participants  (13)  was 
sufficient  in  order  to  achieve  statistical  power  greater  than  0.8  under  all  hypotheses. 


B.  DEVELOPMENT OF THE STATISTICAL MODEL


As with any psychophysical experiment, inconsistencies occur between various
participants’ RTs and error rates under different experimental conditions. These
differences arise in part from variation among individual participants, image conditions, and
manipulations. To account for these differences, the following model is proposed:

RT_{ijklmn} (or error rate) = Part_i + Color_j + Hue_k + Noise_l + Block_m + error_n

where Part_i represents the ith participant of the experiment;

Color_j represents the color of the image (chromatic or achromatic);

Hue_k represents the hue condition of the image (natural or false);

Noise_l represents the level of achromatic Gaussian noise (0, 50, 100);

Block_m represents each of the two sessions of the experiment (S1 or S2);

error_n represents unaccounted variations within the model.


C.  RESULTS  AND  DISCUSSION 

Mean  RTs  and  error  rates  for  all  twelve  combinations  of  color  conditions  and 
levels  of  noise,  averaged  across  participants,  are  shown  in  Table  3  and  Table  4.  These 
same results are also represented graphically in Fig. 11 and Fig. 12.

Table 3: Mean reaction times and standard errors (msec) for each level of noise and
color condition.

                            COLOR CONDITION

NOISE               Natural hue   Natural hue   False hue    False hue
LEVEL               chromatic     achromatic    chromatic    achromatic

0        Mean        0.40          1.21          0.61         2.23
         SE          0.28          0.44          0.34         0.66

1        Mean        1.62          3.04          3.24         4.05
         SE          0.57          0.78          0.72         0.84

2        Mean        3.04          8.12          8.30        10.53
         SE          0.88          1.37          1.28         1.49

Table 4: Mean error rates and standard errors (%) for each level of noise and color
condition.

Figure 11 illustrates how RTs increased for all four color formats when noise
increased. Natural color (NC) RTs were the shortest at each level of noise compared to
the other color formats, as was predicted by the first hypothesis. The differences in RTs
between  NC  and  FC  conditions,  however,  did  not  increase  as  the  level  of  noise  increased, 
in  opposition  to  the  statement  that  was  made  in  the  second  hypothesis.  These  differences 
were  almost  constant  across  all  levels  of  noise,  although  RTs  for  NC  images  were  always 
shorter  than  FC  RTs  at  the  same  level  of  noise. 

At level of noise 0, RTs were shortest for natural hue formats (NC and NG) and
longest for artificial hue conditions (FC and FG). As the level of noise increased, RTs for
the achromatic conditions (NG and FG) became slower compared to the chromatic
conditions (NC and FC). The difference in RTs between chromatic and achromatic stimuli
increased as noise increased, with the longest RTs reached at the highest level of noise, as
stated in the third hypothesis. Also, although the FC condition had the longest RT at
level of noise 0, its relative performance improved as noise increased and it showed the second
best result at level of noise 2, as stated in the fourth hypothesis.

There was great similarity between the RT and error rate results. Figure 12
illustrates how NC error rates were smaller at each level of noise compared to the other
color formats, as stated in the first hypothesis. At level of noise 0, error rates were
very similar for NC and NG formats. FC error rates were almost the same as NC error
rates at that level, although for levels of noise 1 and 2, FC error rates were more similar to those of the
achromatic formats (NG and FG), and their differences from NC stimuli increased as the
level of noise increased, as was predicted by the second hypothesis.

Both figures show larger RTs and error rates for the achromatic conditions and
intermediate results for the FC conditions at each level of noise, as was predicted by the
third and fourth hypotheses.

Using the same experimental results, this study measured
the advantage of using natural color versus false color at different levels of noise.
Because changes of hue also entailed changes of luminance within stimulus images, a
direct comparison of RTs to NC and FC images cannot indicate effects of natural
chromatic information. In order to assess the benefit of the use of color, therefore, Tables
5 and 6 express the differences in RTs and error rates (ERs) between achromatic and chromatic
natural conditions (NG-NC) and between achromatic and chromatic artificial conditions
(FG-FC) at each level of noise. These values will be referred to as natural color benefit
and false color benefit. If the use of color conferred any benefit, chromatic
conditions should have obtained better results than their gray scale counterparts and this
advantage should have increased with increasing levels of noise. Were the benefits of
color rendering the exclusive result of facilitated image segmentation, furthermore, then
effects of natural and false color should have been similar across all levels of noise.
Conversely, were natural color useful in accessing the stored information necessary for
naming stimulus items, natural color benefit should have exceeded the benefits of false
color, particularly at high levels of image noise where information about the stimulus
items’ shapes was most severely degraded. These differences are also shown in Fig. 13
and Fig. 14.
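
As an arithmetic check, the color benefits for error rates can be recomputed from the cell means reported in Table 4. The sketch below is illustrative only: the dictionary layout and the name `color_benefit` are assumptions, while the numbers are the Table 4 means (in percent).

```python
# Mean error rates (%) from Table 4, keyed by noise level, then by condition.
ERROR_RATES = {
    0: {"NC": 0.40, "NG": 1.21, "FC": 0.61, "FG": 2.23},
    1: {"NC": 1.62, "NG": 3.04, "FC": 3.24, "FG": 4.05},
    2: {"NC": 3.04, "NG": 8.12, "FC": 8.30, "FG": 10.53},
}

def color_benefit(cells):
    """Return (natural, false) color benefit: achromatic minus chromatic."""
    natural = round(cells["NG"] - cells["NC"], 2)
    false = round(cells["FG"] - cells["FC"], 2)
    return natural, false

benefits = {level: color_benefit(cells) for level, cells in ERROR_RATES.items()}
for level, (natural, false) in benefits.items():
    print(f"Level {level}: natural = {natural:.2f}, false = {false:.2f}")
```

The resulting differences agree with the error rate differences listed in Table 6.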

Figure 13 shows almost no advantage at level of noise 0 for natural conditions and
a small, although non-significant, disadvantage for artificial conditions. As noise
increases, color benefits increase too, with the largest advantage at the highest level of
noise and very similar advantages for both natural and artificial conditions. The use
of either natural or artificial color seems to be similarly helpful regarding RTs in a
recognition task, although NC RTs are always faster than their FC counterparts.


              Natural    Artificial

Level 0         4.48       -17.06
Level 1        30.89        29.95
Level 2        75.78        82.04

Table 5: Mean reaction time differences (msec) for each level of noise and
hue.


              Natural    Artificial

Level 0         0.81         1.62
Level 1         1.42         0.81
Level 2         5.08         2.23

Table 6: Mean error rate differences (%) for each level of noise and hue.

Figure 13: Mean reaction times color advantage (msec) for each level of noise and
hue.


Figure 14 illustrates how, for natural conditions, color benefit for error rates
increased with the level of noise, but artificial conditions did not show a similar increase
of benefit in error rates when the level of noise increased, although these differences were
non-significant. Nevertheless, there was always some benefit in accuracy at each
level of noise and for each color condition.

Figure 14: Mean error rates color advantage (%) for each level of noise and hue.

To determine the appropriate statistical method for the analysis of the
results of this experiment, normality was assessed using histograms and normal
Q-Q plots of residuals as diagnostic plots. These diagnostics showed that the data were not
normally distributed. In order to satisfy the assumption of normality, power
transformations were applied to the RT and error rate data. RTs on trials on which
participants responded incorrectly were treated as missing values. Repeated measures
analyses of variance (ANOVAs) were performed on the transformed data, both for RTs
and for error rates, with a significance level of 0.01. Although the analyses were
performed on 1/RT and the square root of ER, the terms RT and error rates are used for
convenience throughout this study. Mean RTs in milliseconds (msec) and mean error
rates in percent for each of the participants were calculated from individual performances
for each condition. When RT and error rate means are reported in the text, these are
untransformed data.
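
The two transformations named above can be sketched as follows. The trial records and function names are hypothetical; only the reciprocal transform for RTs, the square-root transform for error rates, and the treatment of incorrect trials as missing values come from the text.

```python
import math

def transform_rt(rt_msec, correct):
    """Return 1/RT for a correct trial, or None (missing) for an incorrect one."""
    if not correct:
        return None
    return 1.0 / rt_msec

def transform_error_rate(error_percent):
    """Return the square root of an error rate given in percent."""
    return math.sqrt(error_percent)

# Hypothetical trials: (reaction time in msec, response correct?)
trials = [(1000.0, True), (800.0, True), (1250.0, False)]
inverse_rts = [transform_rt(rt, ok) for rt, ok in trials]
valid = [x for x in inverse_rts if x is not None]  # drop missing values
mean_inverse_rt = sum(valid) / len(valid)
```

Means computed on the transformed scale are used only for testing; the values quoted in the text remain on the original msec and percent scales.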

The analysis, with participants as a random variable, was a 2 X 2 X 3 X 2 repeated
measures ANOVA, with the independent variables Hue (natural, false), Color (chromatic,
achromatic), Noise (0, 50, 100), and Block (1, 2). The first three independent variables
were repeated within subjects. The same ANOVA that was used for RTs was also used
for error rates analysis.

The a priori hypotheses and some interesting interactions can be explored using
univariate analysis on RT and error rates separately. ANOVA on the dependent variable
RT showed significant main effects for Color: F(1, 5674) = 20.7366, p < 0.01; Hue:
F(1, 5674) = 36.0931, p < 0.01; and Noise: F(2, 5674) = 141.0512, p < 0.01. Participants
responded significantly faster when images were chromatic rather than achromatic, when
hue was natural rather than artificial, and when images were less degraded by Gaussian
noise. Mean RT for chromatic images was 978.94 msec (Std. Dev. 122.4 msec) and for
achromatic images was 1013.28 msec (Std. Dev. 131.1 msec). Mean RT for natural
images was 973.68 msec (Std. Dev. 120.6 msec) and for artificial images was 1018.54 msec
(Std. Dev. 131.1 msec). Mean RTs for the three different levels of noise were: Level 0:
935.98 msec (Std. Dev. 105.4 msec), Level 1: 973.24 msec (Std. Dev. 106.4 msec), and
Level 2: 1079.11 msec (Std. Dev. 124.8 msec). The ANOVA results and the above values
support a significant difference in mean RTs across color, hue and noise conditions.

Similar results were obtained for the dependent variable ER. ANOVA also showed
significant main effects for Color: F(1, 285) = 17.5616, p < 0.01; Hue: F(1, 285) =
16.9579, p < 0.01; and Noise: F(2, 285) = 54.6472, p < 0.01. Mean error rate for
chromatic images was 2.87 (Std. Dev. 3.58) and for monochromatic images was 4.86 (Std.
Dev. 4.89). Mean error rate for natural images was 2.91 (Std. Dev. 3.76) and for artificial
images was 4.82 (Std. Dev. 4.77). Mean error rates for the three different levels of noise
were: Level 0: 1.11 (Std. Dev. 1.76), Level 1: 2.99 (Std. Dev. 2.71), and Level 2: 7.50
(Std. Dev. 5.11). The ANOVA results and the above values support a significant
difference in mean error rates across color, hue and noise conditions. Participants were not
only faster but also more accurate when chromatic, natural and non-degraded images were
shown to them.
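
The marginal error rates quoted above can be approximately recovered by averaging the Table 4 cell means, which serves as a consistency check on the reported values. The sketch below is illustrative: the data layout and the helper `marginal_mean` are assumptions, and it averages rounded cell means rather than raw trials, so agreement is expected only to within about 0.01 percentage points.

```python
# Mean error rates (%) from Table 4: (hue, color, noise level) -> cell mean.
CELLS = {
    ("natural", "chromatic", 0): 0.40, ("natural", "achromatic", 0): 1.21,
    ("false", "chromatic", 0): 0.61, ("false", "achromatic", 0): 2.23,
    ("natural", "chromatic", 1): 1.62, ("natural", "achromatic", 1): 3.04,
    ("false", "chromatic", 1): 3.24, ("false", "achromatic", 1): 4.05,
    ("natural", "chromatic", 2): 3.04, ("natural", "achromatic", 2): 8.12,
    ("false", "chromatic", 2): 8.30, ("false", "achromatic", 2): 10.53,
}
FACTOR_INDEX = {"hue": 0, "color": 1, "noise": 2}

def marginal_mean(factor, level):
    """Average the cell means that share one level of one factor."""
    idx = FACTOR_INDEX[factor]
    values = [v for key, v in CELLS.items() if key[idx] == level]
    return sum(values) / len(values)

chromatic = marginal_mean("color", "chromatic")
achromatic = marginal_mean("color", "achromatic")
natural = marginal_mean("hue", "natural")
false_hue = marginal_mean("hue", "false")
by_noise = [marginal_mean("noise", level) for level in (0, 1, 2)]
```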

Once the significance of the main effects was tested, the factorial interactions were
analyzed. The next set of figures was constructed to assist in analyzing factorial
interactions for both dependent variables. Figure 15 and Figure 16 illustrate the Hue by
Color interaction. The lines joining mean RTs and error rates for the same Color level are
roughly parallel across the two levels of Hue. Apparently, there is no interaction between
these two factors at the 1% significance level. ANOVA yields these results: dependent
variable RT: F(1, 5674) = 4.5517, p = 0.0329; dependent variable ER: F(1, 285) = 1.1320, p
= 0.2882, confirming the assumption derived from the graph inspection.

The  non-significance  of  this  interaction  for  both  dependent  variables  shows  that 
although  participants  were  faster  and  more  accurate  with  chromatic  images  compared  to 
their  achromatic  counterparts,  these  differences  in  RTs  and  error  rates  were  similar  for 
both natural and false hue. It seems that the advantage of NC images over NG images
(derived from the presence of color) is similar to the advantage of FC over FG images.
These  results  suggest  that  false  color  is  not  interfering  with  recognition;  otherwise  the 
difference  between  natural  conditions  (NG  -NC)  compared  to  the  difference  between 
artificial  conditions  (FG-FC)  should  have  been  significant. 

Figure 17 and Figure 18 illustrate the Noise by Color interaction. Figure 17 shows
how, as the level of noise increases, the differences in RTs between chromatic and achromatic images increase
too, with faster RTs and greater accuracy for the chromatic images. Apparently there is an
interaction between these two factors for the dependent variable RT. ANOVA yields these
results: dependent variable RT: F(2, 5674) = 9.4622, p < 0.01; dependent variable ER:
F(2, 285) = 1.4343, p = 0.2399; therefore only the interaction for RT is significant.


Figure 15: Mean reaction times (msec) for hue and color conditions.


Figure  16:  Mean  error  rates  (%)  for  hue  and  color  conditions. 


Figure  18:  Mean  error  rates  (%)  for  noise  and  color  conditions. 

The significance of this interaction for the dependent variable RT shows that participants
were  not  only  faster  with  chromatic  images  compared  to  their  achromatic  counterparts  but 
also  that  the  difference  between  chromatic  and  achromatic  stimuli  increased  as  the  levels 
of  noise  increased.  These  results  suggest  that  color  is  playing  a  role  in  object  recognition 
speeding  the  identification  of  the  stimuli  when  their  shape  is  degraded  and  that  the  more 
degraded  the  objects  are,  the  more  helpful  color  is.  As  noted  above,  the  absence  of  an 
interaction  between  level  of  hue  (natural  or  false)  and  level  of  color  (chromatic  or 
achromatic)  indicates  that  this  effect  is  the  result  of  facilitated  image  segmentation. 

Figure 19 and Figure 20 illustrate the Noise by Hue interaction. Figure 19 shows
how the lines that represent mean RTs for natural and false images across the three
levels of noise remain parallel to each other. Apparently there is no interaction between
these two factors for the variable RT. Figure 20 shows how these lines slightly diverge for
increasing levels of noise, indicating a possible interaction between these two factors for
the dependent variable ER. ANOVA yields these results: dependent variable RT:
F(2, 5674) = 0.3262, p = 0.7217; dependent variable ER: F(2, 285) = 2.7447, p = 0.0659,
indicating that the hue by noise interaction is not significant for either dependent variable.
The non-significance of this interaction for both RT and
error rates shows that although participants were faster and more accurate with natural
hue images than with their artificial counterparts, the difference between these two
formats did not increase with increasing levels of noise. These results suggest that false
color is not interfering with recognition as the level of noise increases, in a similar way to
what was shown for the Hue by Color interaction.

The three-way factorial interaction was non-significant for both dependent measures
according to the ANOVA results: dependent variable RT: F(2, 5674) = 0.1944, p =
0.8233; dependent variable ER: F(2, 285) = 1.2618, p = 0.2847. Based on these results,
the second a priori null hypothesis (no interaction of NC and FC conditions) cannot be
rejected. The non-significance of this interaction shows that the difference between
natural hue images and their gray scale counterparts (NG-NC) and between false hue
images and their respective achromatic counterparts (FG-FC) does not change
significantly when the levels of noise change. These results suggest that the
beneficial effects of both natural color and false color remain similar for increasing levels of
noise.

Experiment data analysis showed that RTs for color stimuli were faster compared
to gray scale images and that this effect increased when the level of noise increased. These
results suggest that both natural color and false color conditions might be beneficial in
object recognition. The advantage of using natural color seems to be similar to the
advantage that is obtained when artificial color is used. Therefore false color does not
seem to be disruptive in recognition tasks. Both natural and false color stimuli are similarly
helpful at different levels of noise, such that even for high degradation levels false color
remains non-disruptive in object recognition. All these results also suggest that participants
are not using color to recognize the objects in a top-down process. They are just carrying out
a bottom-up process using color for image segmentation, without any effect of the level of
hue (natural or false) or of the level of noise. If they were using stored knowledge of
color to recognize objects, the advantage of using natural color should be larger than the
advantage obtained from the use of false color, and this is not the case.

It should be recalled that FC images were obtained by means of hue manipulation
of the original NC images. The significant difference in participants’ performance at
level of noise 0 when dealing with these images could have arisen for either of two
reasons: i) false color is disruptive in object naming tasks, because of its incongruence with
the stored knowledge of color; ii) changes in luminance with respect to the NC images
were introduced during hue manipulation. All achromatic stimuli, with either natural or false hue,
were obtained from their chromatic counterparts without introducing any change in
luminance during the transformation. Luminance of NC stimuli is the same as the luminance
of NG stimuli. For the same reason, luminance of FC stimuli is the same as the luminance
of the FG stimuli. Therefore, differences in responses between NG and NC stimuli, or
between FG and FC stimuli, are due just to changes in color. Also, ANOVA results for
hue, color and noise effects showed that false color was not disruptive during the naming
task conducted in this experiment. Thus, differences in responses between NC and FC
images are due just to changes in luminance.

NG images achieved better performance than their FG counterparts at all levels of noise
for the dependent variable RT. The results of these two achromatic conditions were
expected to be similar. The different results cannot be explained by differences in
color, given that both conditions are achromatic. The diverging results must therefore
have originated from changes in luminance, possibly introduced when hue was manipulated.
This interpretation is based on the similar size of the differences with their chromatic
counterparts (NG-NC vs. FG-FC), for both RTs and error rates, and on the fact that
grayscale images were obtained from their respective chromatic counterparts simply by
eliminating color. Therefore, differences between NC and FC stimuli for RTs are
attributed to changes in luminance.

One possible explanation is that the hue manipulation altered the distribution of
luminance within each image. Measures of luminance for NC and FC stimuli showed that
luminance increased for some images and decreased for others when the hue was changed.
Although the mean luminance across all images in each format was essentially unchanged
(at noise level 0, mean luminance was 150.06, SD 14.86, for NC images and 149.46,
SD 13.79, for FC images), changes in luminance could have affected each image
differently, with luminance increasing in some regions of the image and decreasing in
others. This could have left the mean luminance of each whole image unchanged while
changing the contrast within the image. Poorer contrast in the vicinity of the object’s
contour could have made object recognition more difficult.
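
The distinction between mean luminance and contrast can be illustrated with a minimal
sketch. The two 4x4 luminance maps below are hypothetical values chosen for
illustration, not the experimental stimuli, and RMS contrast (the standard deviation of
pixel luminance) is assumed here as the contrast measure; the study does not state which
measure, if any, was used.

```python
from statistics import mean, pstdev


def mean_luminance(image):
    """Average luminance over all pixels of a 2-D luminance map."""
    return mean(pixel for row in image for pixel in row)


def rms_contrast(image):
    """RMS contrast: population standard deviation of pixel luminance."""
    return pstdev([pixel for row in image for pixel in row])


# Hypothetical luminance map: a bright object (200) on a darker background (100).
natural = [
    [100, 100, 100, 100],
    [100, 200, 200, 100],
    [100, 200, 200, 100],
    [100, 100, 100, 100],
]

# Hypothetical result of a hue change: the object becomes darker and the
# background lighter, so the luminance step at the object's contour shrinks
# while the mean over the whole image stays the same.
false_hue = [
    [115, 115, 115, 115],
    [115, 155, 155, 115],
    [115, 155, 155, 115],
    [115, 115, 115, 115],
]

for name, image in (("natural", natural), ("false hue", false_hue)):
    print(name, "mean:", mean_luminance(image), "RMS contrast:", rms_contrast(image))
```

Both maps have the same mean luminance, but the second has a lower RMS contrast and a
smaller luminance step at the object boundary, which is the situation hypothesized above.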

Paired comparisons using Tukey’s method were conducted among the color conditions at
each level of noise, for both dependent variables (RT and error rate). These results are
represented graphically in Table 7. Underlined pairs are non-significant. The numeric
results of these comparisons can be seen in Tables 8 and 9. These tables show that, at
each level of noise, participants were faster and more accurate with NC stimuli than
with any other condition.

Based on these results, the first null hypothesis can be rejected. Differences in RTs
between NC and FC stimuli were significant at noise level 0 but non-significant at the
other levels of noise, and the difference between them did not increase as the level of
noise increased, so the second null hypothesis cannot be rejected. The third and fourth
null hypotheses can be rejected because grayscale images achieved the longest RTs and
greatest error rates at each level of noise, and the FC images achieved intermediate
results, as hypothesized. For the dependent variable error rate, only the comparison
between NC and FG stimuli was significant, at noise levels 0 and 2.


RTs Paired Comparisons

Noise Level 0    NC    NG    FG    FC
Noise Level 1    NC    NG    FC    FG
Noise Level 2    NC    FC    NG    FG

ERs Paired Comparisons

Noise Level 0    NC    FC    NG    FG
Noise Level 1    NC    NG    FC    FG
Noise Level 2    NC    NG    FC    FG

Table 7: Tukey’s Method Paired Comparisons.

Noise level 0    FC                FG                NG                NC
FC               X                 0.000019          0.000061 (SIG)    0.000066 (SIG)
FG               0.000019          X                 0.000042 (SIG)    0.000047 (SIG)
NG               0.000061 (SIG)    0.000042 (SIG)    X                 0.000005
NC               0.000066 (SIG)    0.000047 (SIG)    0.000005          X
W=0.000041

Noise level 1    FG                FC                NG                NC
FG               X                 0.000031          0.000036          0.000070 (SIG)
FC               0.000031          X                 0.000005          0.000039
NG               0.000036          0.000005          X                 0.000034
NC               0.000070 (SIG)    0.000039          0.000034          X
W=0.000043

Noise level 2    FG                NG                FC                NC
FG               X                 0.000044          0.000067 (SIG)    0.000112 (SIG)
NG               0.000044          X                 0.000023          0.000068 (SIG)
FC               0.000067 (SIG)    0.000023          X                 0.000045
NC               0.000112 (SIG)    0.000068 (SIG)    0.000045          X
W=0.000048

Table 8: Mean reaction times paired comparisons for each level of noise and color
format (Transformed data).


Noise level 0    FG             NG             FC             NC
FG               X              0.390          0.713          0.856 (SIG)
NG               0.390          X              0.323          0.466
FC               0.713          0.323          X              0.143
NC               0.856 (SIG)    0.466          0.143          X
W=0.829

Noise level 1    FG             FC             NG             NC
FG               X              0.212          0.269          0.739
FC               0.212          X              0.057          0.527
NG               0.269          0.057          X              0.470
NC               0.739          0.527          0.470          X
W=1.134

Noise level 2    FG             FC             NG             NC
FG               X              0.363          0.394          1.501 (SIG)
FC               0.363          X              0.031          1.138
NG               0.394          0.031          X              1.107
NC               1.501 (SIG)    1.138          1.107          X
W=1.290

Table 9: Mean error rates paired comparisons for each level of noise and color
format (Transformed data).
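
The significance flags in Table 8 can be reproduced with a short sketch, assuming that W
is the critical difference from Tukey’s method and that a pair is flagged (SIG) when its
absolute difference in means exceeds W. The differences are copied from the noise level
0 block of Table 8; the function name is illustrative and not taken from the study.

```python
# Absolute pairwise differences between condition means (transformed RT data),
# copied from Table 8, noise level 0.
differences = {
    ("FC", "FG"): 0.000019,
    ("FC", "NG"): 0.000061,
    ("FC", "NC"): 0.000066,
    ("FG", "NG"): 0.000042,
    ("FG", "NC"): 0.000047,
    ("NG", "NC"): 0.000005,
}

# Critical difference reported under the same block of Table 8.
# Assumption: W plays the role of Tukey's critical (honestly significant) difference.
W = 0.000041


def significant_pairs(pair_differences, critical_difference):
    """Return the sorted pairs whose difference exceeds the critical difference."""
    return sorted(
        pair
        for pair, difference in pair_differences.items()
        if difference > critical_difference
    )


sig = significant_pairs(differences, W)
for first, second in sig:
    print(f"{first} vs. {second}: significant")
```

Under these assumptions the flagged pairs are FC-NG, FC-NC, FG-NG and FG-NC, while FC-FG
and NG-NC fall below W, matching the (SIG) marks in the table.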

In order to investigate a possible learning effect caused by the first session of the
experiment serving as training for the second one, the independent variable Block was
included in the model with two levels, Sessions 1 and 2, to test for a significant
difference between the results of the two sessions. A greater learning effect for the FC
conditions could have obscured the assumed detrimental effects of FC in object
recognition by decreasing RTs and error rates during the second session. This effect
could be based on the knowledge acquired by the participants during the first session of
the experiment. The main effect of Block and its interactions with Hue and Color were
therefore analyzed for both RTs and error rates. The ANOVA yielded a significant main
effect of Block on the dependent variable RT, F(1,5674) = 67.9391, p < 0.01, and on the
dependent variable ER, F(1,285) = 15.3184, p < 0.01. None of the interactions was
significant for either dependent variable. The Hue by Block interaction yielded
F(1,5674) = 0.6212, p = 0.4306 for RT and F(1,285) = 0.00263, p = 0.9591 for error rate.
The Color by Block interaction yielded F(1,5674) = 0.0653, p = 0.7983 for RT and
F(1,285) = 0.5874, p = 0.4441 for error rate. These results suggest that participants’
responses were faster and more accurate during the second session, possibly because of a
learning effect during the first session of the experiment. However, the improvement was
not significantly greater for any specific color condition; therefore a greater learning
effect, for FC or for any other particular color condition, could not be demonstrated.
The learning effect was similar for every color condition.

IV. CONCLUSIONS


This  experiment  examined  the  role  of  natural  and  false  color  in  an  object 
recognition  task,  with  degraded  and  non-degraded  images,  focusing  on  improving  night 
vision  devices  that  employ  the  new  technology  of  color  fusion  displays. 

Four hypotheses were stated in the introduction that summarized several assumptions
based on previous research about the role of color in object recognition. The
experimental design was employed to explore dependent measures such as reaction time and
accuracy in object recognition, critical factors when accomplishing military missions in
which night vision devices are involved.

The results and discussion presented in the previous chapter supported rejecting all but
the second null hypothesis. The first, third and fourth null hypotheses were rejected,
based on data analysis showing that natural color images achieved the best performance
at every level of image degradation, achromatic images achieved the longest RTs and
largest error rates, and false color stimuli reached an intermediate level of
performance between these two groups of stimuli. The second null hypothesis, that
differences in RTs between natural color images and their false color counterparts do
not increase with increasing levels of image degradation, could not be rejected. These
results are summarized in Table 10.

Null Hypothesis                     Result         Conclusion

No differences in RTs or ERs        Reject Null    Shorter RTs and smaller ERs
between natural color stimuli       Hypothesis     within natural color stimuli
and other color formats, across                    across all levels of noise
all levels of noise

No increasing differences in RTs    NOT reject     (Null hypothesis)
or ERs between natural color        Null
and false color stimuli for         Hypothesis
increasing levels of noise

No differences in RTs or ERs        Reject Null    Longest RTs and greatest ERs
between chromatic and               Hypothesis     within the achromatic stimuli
achromatic stimuli across all                      across all levels of noise
levels of noise

No differences in RTs or ERs        Reject Null    Intermediate results for false
between natural hue and false       Hypothesis     hue stimuli across all levels of
hue stimuli across all levels of                   noise.
noise.

Table 10: Summary of the results.

Data analysis suggests that differences in performance between natural and false hue
stimuli were due to differences in luminance and not to chromatic differences, such that
if two images (NC and FC conditions) were matched for luminance, it should be
inconsequential whether the hues were natural or false.

From the analysis conducted to assess the benefit of using color in object recognition,
it can be concluded that natural and false hue conditions were equally beneficial in the
task performed during the experiment. There was no evidence of false color as a
disruptive factor during this task, and both natural and false hue were similarly useful
at different levels of image degradation. Thus, results indicate that participants
conducted a bottom-up process during the object recognition task, making use of color
(natural or false) only to achieve image segmentation. These findings are consistent
with the views of Wurm & Legge (1993) and Biederman & Ju (1988) that primary access to
object recognition uses structural (geometrical) representation of [...]
representation is in part generated by the presence of color. Participants did not use
color to access stored knowledge of the object’s psychological representation. If
participants had been taking advantage of natural color to achieve object recognition,
the benefit of natural color should have been larger than the benefit of false color,
because they would have been able to carry out top-down and bottom-up recognition
processes simultaneously.

More research must still be done in the field of human nighttime visual performance,
given that, as already stated, the benefits of integrating synthetic color into fused
imagery depend on the color algorithm used, the visual task performed, and scene content
(Steele & Perconti, 1997). Future research should study the benefits of using false
color in each of these scenarios.

The results of this study give an indication that false color may be useful in future
color fusion devices based on its facilitation of image segmentation with shape-degraded
images. Although this study was far from covering all the scenarios that may arise
during a nighttime military operation, it shows that it is plausible to consider the use
of synthetic color in the development of new military night vision devices.